# EFCore.BulkExtensions **Repository Path**: CC1027CC/EFCore.BulkExtensions ## Basic Information - **Project Name**: EFCore.BulkExtensions - **Description**: No description available - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2021-04-01 - **Last Updated**: 2021-04-01 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # EFCore.BulkExtensions EntityFrameworkCore extensions:
-Bulk operations (**Insert, Update, Delete, Read, Upsert, Sync, Truncate**) and
-Batch ops (**Delete, Update**).
Library is Lightweight and very Efficient, having all mostly used CRUD operation.
Was selected in top 20 [EF Core Extensions](https://docs.microsoft.com/en-us/ef/core/extensions/) recommended by Microsoft.
Current version is using EF Core 3.1 and at the moment supports Microsoft SQL Server(2008+) and SQLite.
It is targeting NetStandard 2.0 so it can be used on project targeting NetCore(2.0+) or NetFramework(4.6.1+).
Versions between 3.1.0 and 3.0.0 are using EF Core 3.0 and targeting NetStandard 2.1 so could only be on NetCore(3.0+).
Versions before 3.0, last 2.6.4, are targeting NetStandard 2.0 and can be used with NetCore(2.2) or NetFramework(4.6.1+).
EFCore/v.Nuget: EFCore2.1/v2.4.1 EFCore2.0/v2.0.8, and for EF Core 1.x use 1.1.0 (targeting NetStandard 1.4)
Under the hood uses [SqlBulkCopy](https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx) for Insert, for Update/Delete combines BulkInsert with raw Sql [MERGE](https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql).
For SQLite there is no BulkCopy, instead library uses plain SQL combined with [UPSERT](https://www.sqlite.org/lang_UPSERT.html).
Bulk Tests can not have UseInMemoryDb because InMemoryProvider does not support Relational-specific methods. Available on [![NuGet](https://img.shields.io/nuget/v/EFCore.BulkExtensions.svg)](https://www.nuget.org/packages/EFCore.BulkExtensions/) latest version.
Package manager console command for installation: *Install-Package EFCore.BulkExtensions* ## Contributing If you find this project useful you can mark it by leaving a Github **\*Star**.
If you would like to support the Project by making a Donation ($10) *#BuyMeBeer*, you are welcome to do so:
[![Donate](https://img.shields.io/badge/donate-PayPal-yellow.svg)](https://www.paypal.me/BorisDjurdjevic/10) or [![Donate Bitcoin](https://img.shields.io/badge/donate-Bitcoin-orange.svg)](https://borisdj.github.io/pages/donateBTC.html)
Please read [CONTRIBUTING](CONTRIBUTING.md) for details on code of conduct, and the process for submitting pull requests.
[![NuGet](https://img.shields.io/npm/l/express.svg)](https://github.com/borisdj/EFCore.BulkExtensions/blob/master/LICENSE)
Want to **Contact** us for Hire (Development & Consulting): [www.codis.tech](http://www.codis.tech) ## Usage It's pretty simple and straightforward.
**Bulk** Extensions are made on *DbContext* class and can be used like this (both regular and Async methods are supported): ```C# context.BulkInsert(entitiesList); context.BulkInsertAsync(entitiesList); context.BulkInsertOrUpdate(entitiesList); context.BulkInsertOrUpdateAsync(entitiesList); //Upsert context.BulkInsertOrUpdateOrDelete(entitiesList); context.BulkInsertOrUpdateOrDeleteAsync(entitiesList);//Sync context.BulkUpdate(entitiesList); context.BulkUpdateAsync(entitiesList); context.BulkDelete(entitiesList); context.BulkDeleteAsync(entitiesList); context.BulkRead(entitiesList); context.BulkReadAsync(entitiesList); context.Truncate(); context.TruncateAsync(); ``` **Batch** Extensions are made on *IQueryable* DbSet and can be used as in the following code segment.
They are done as pure sql and no check is done whether some are prior loaded in memory and are being Tracked.
(*updateColumns* is optional parameter in which PropertyNames added explicitly when we need update to it's default value) ```C# // Delete context.Items.Where(a => a.ItemId > 500).BatchDelete(); context.Items.Where(a => a.ItemId > 500).BatchDeleteAsync(); // Update (using Expression arg.) supports Increment/Decrement context.Items.Where(a => a.ItemId <= 500).BatchUpdate(a => new Item { Quantity = a.Quantity + 100 }); // can be as value '+100' or as variable '+incrementStep' (int incrementStep = 100;) // Update (via simple object) context.Items.Where(a => a.ItemId <= 500).BatchUpdate(new Item { Description = "Updated" }); context.Items.Where(a => a.ItemId <= 500).BatchUpdateAsync(new Item { Description = "Updated" }); // Update (via simple object) - requires additional Argument for setting to Property default value var updateColumns = new List { nameof(Item.Quantity) }; // Update 'Quantity' to default value('0'-zero) var q = context.Items.Where(a => a.ItemId <= 500); int affected = q.BatchUpdate(new Item { Description = "Updated" }, updateColumns);//result assigned to variable ``` ## Bulk info If Windows Authentication is used then in ConnectionString there should be *Trusted_Connection=True;* because Sql credentials are required to stay in connection.
When used directly each of these operations are separate transactions and are automatically committed.
And if we need multiple operations in single procedure then explicit transaction should be used, for example:
```C# using (var transaction = context.Database.BeginTransaction()) { context.BulkInsert(entities1List); context.BulkInsert(entities2List); transaction.Commit(); } ``` **BulkInsertOrUpdate** method can be used when there is need for both operations but in one connection to database.
It makes Update when PK(PrimaryKey) is matched, otherwise does Insert.
**BulkInsertOrUpdateOrDelete** effectively [synchronizes](https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/) table rows with input data.
Those in Db that are not found in the list will be deleted.
**BulkRead** does SELECT and JOIN based on one or more Unique columns that are specified in Config `UpdateByProperties`.
More info in the [Example](https://github.com/borisdj/EFCore.BulkExtensions#read-example) at the bottom.
Note: Bulk ops have optional argument *Type type* that can be set to type of Entity if list has dynamic runtime objects or is inhereted from Entity class. ## BulkConfig arguments **BulkInsert_/OrUpdate/OrDelete** methods can have optional argument **BulkConfig** with properties (bool, int, object, List):
{ *PreserveInsertOrder, SetOutputIdentity, BatchSize, NotifyAfter, BulkCopyTimeout, EnableStreaming, UseTempDB, UniqueTableNameTempDb, CustomDestinationTableName, TrackingEntities, WithHoldlock, CalculateStats, PropertiesToInclude, PropertiesToIncludeOnCompare, PropertiesToExclude, PropertiesToIncludeOnUpdate, PropertiesToExcludeOnCompare, PropertiesToExcludeOnUpdate, UpdateByProperties, EnableShadowProperties, IncludeGraph, OmitClauseExistsExcept, DoNotUpdateIfTimeStampChanged, SRID, SqlBulkCopyOptions }*
Default behaviour is
{ *PreserveInsertOrder*: true, *SetOutputIdentity*: false, *BatchSize*: 2000, *NotifyAfter*: null, *BulkCopyTimeout*: null,
*EnableStreaming*: false, *UseTempDB*: false, *UniqueTableNameTempDb*: true, *CustomDestinationTableName*: null, *TrackingEntities*: false, *WithHoldlock*: true,
*CalculateStats*: false, *PropertiesToInclude*: null, *PropertiesToIncludeOnCompare* : null, *PropertiesToIncludeOnUpdate* : null, *PropertiesToExclude*: null, *PropertiesToExcludeOnCompare*: null, *PropertiesToExcludeOnUpdate*: null, *UpdateByProperties*: null, *EnableShadowProperties*: false, *IncludeGraph*: false, *OmitClauseExistsExcept*: false, *DoNotUpdateIfTimeStampChanged*: false, *SRID*: 4326, *SqlBulkCopyOptions*: Default }

If we want to change defaults, BulkConfig should be added explicitly with one or more bool properties set to true, and/or int props like **BatchSize** to different number.
Config also has DelegateFunc for setting *Underlying-Connection/Transaction*, e.g. in UnderlyingTest.
When doing update we can chose to exclude one or more properties by adding their names into **PropertiesToExclude**, or if we need to update less then half column then **PropertiesToInclude** can be used. Setting both Lists are not allowed. When using the **BulkInsert_/OrUpdate** methods, you may also specify the **PropertiesToIncludeOnCompare** and **PropertiesToExcludeOnCompare** properties. By adding a column name to the *PropertiesToExcludeOnCompare*, will allow it to be inserted and updated but will not update the row if any of the other columns in that row did not change. For example, if you are importing bulk data and want to keep an internal *CreateDate* or *UpdateDate*, you add those columns to the *PropertiesToExcludeOnCompare*.
Another option that may be used in the same scenario are the **PropertiesToIncludeOnUpdate** and **PropertiesToExcludeOnUpdate** properties. These properties will allow you to specify insert-only columns such as *CreateDate* and *CreatedBy*. If we want Insert only new and skip existing ones in Db (Insert_if_not_Exist) then use *BulkInsertOrUpdate* with config `PropertiesToIncludeOnUpdate = new List { "" }` Additionaly there is **UpdateByProperties** for specifying custom properties, by which we want update to be done.
Using UpdateByProperties while also having Identity column requires that Id property be [Excluded](https://github.com/borisdj/EFCore.BulkExtensions/issues/131).
If **NotifyAfter** is not set it will have same value as _BatchSize_ while **BulkCopyTimeout** when not set has SqlBulkCopy default which is 30 seconds and if set to 0 it indicates no limit.

_SetOutputIdentity_ have purpose only when PK has Identity (usually *int* type with AutoIncrement), while if PK is Guid(sequential) created in Application there is no need for them.
Also Tables with Composite Keys have no Identity column so no functionality for them in that case either. ```C# var bulkConfig = new BulkConfig { SetOutputIdentity = true, BatchSize = 4000 }; context.BulkInsert(entList, bulkConfig); context.BulkInsertOrUpdate(entList, new BulkConfig { SetOutputIdentity = true }); context.BulkInsertOrUpdate(entList, b => b.SetOutputIdentity = true); // example of BulkConfig set with Action arg. ``` **PreserveInsertOrder** is **true** by default and makes sure that entites are inserted to Db as ordered in entitiesList.
When table has Identity column (int autoincrement) with 0 values in list they will temporary be automatically changed from 0s into range -N:-1.
Or it can be manually set with proper values for order (Negative values used to skip conflict with existing ones in Db).
Here single Id value itself doesn't matter, db will change it to next in sequence, what matters is their mutual relationship for sorting.
Insertion order is implemented with [TOP](https://docs.microsoft.com/en-us/sql/t-sql/queries/top-transact-sql) in conjuction with ORDER BY. [stackoverflow:merge-into-insertion-order](https://stackoverflow.com/questions/884187/merge-into-insertion-order).
This config should remain true when *SetOutputIdentity* is set to true on Entity containing NotMapped Property. [issues/76](https://github.com/borisdj/EFCore.BulkExtensions/issues/76)
When using **SetOutputIdentity** Id values will be updated to new ones from database.
With BulkInsertOrUpdate for those that will be updated it has to match with Id column, or other unique column(s) if using UpdateByProperties.
**SetOutputIdentity** is useful when BulkInsert is done to multiple related tables, that have Identity column.
After Insert is done to first table, we need Id-s (if using Option 1) that were generated in Db because they are FK(ForeignKey) in second table.
It is implemented with [OUTPUT](https://docs.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql) as part of MERGE Query, so in this case even the Insert is not done directly to TargetTable but to TempTable and then Merged with TargetTable.
When used Id-s will be updated in entitiesList, and if *PreserveInsertOrder* is set to *false* then entitiesList will be cleared and reloaded.
Example of *SetOutputIdentity* with parent-child FK related tables: ```C# int numberOfEntites = 1000; var entities = new List(); var subEntities = new List(); for (int i = 1; i <= numberOfEntites; i++) { var entity = new Item { Name = $"Name {i}" }; entity.ItemHistories = new List() { new ItemHistory { Remark = $"Info {i}.1" }, new ItemHistory { Remark = $"Info {i}.2" } }; entities.Add(entity); } // Option 1 using (var transaction = context.Database.BeginTransaction()) { context.BulkInsert(entities, new BulkConfig { SetOutputIdentity = true }); foreach (var entity in entities) { foreach (var subEntity in entity.ItemHistories) { subEntity.ItemId = entity.ItemId; // setting FK to match its linked PK that was generated in DB } subEntities.AddRange(entity.ItemHistories); } context.BulkInsert(subEntities); transaction.Commit(); } // Option 2 using Graph (only for SQL Server) - all entities in relationship with main ones in list are BulkInsertUpdated context.BulkInsert(entities, b => b.IncludeGraph = true); ``` When **CalculateStats** is set to True the result is return in `BulkConfig.StatsInfo` (*StatsNumber-Inserted/Updated/Deleted*).
If used for pure Insert (with Batching) then SetOutputIdentity should also be configured because Merge have to be used.
**TrackingEntities** can be set to True if we want to have tracking of entities from BulkRead or if SetOutputIdentity is set.
**UseTempDB** when set then BulkOperation has to be [inside Transaction](https://github.com/borisdj/EFCore.BulkExtensions/issues/49)
**UniqueTableNameTempDb** when changed to true temp table name will be only 'Temp' without random numbers.
**CustomDestinationTableName** can be set with 'TableName' only or with 'Schema.TableName'.
**EnableShadowProperties** to add (normal) Shadow Property and persist value. Disables automatic discrimator, use manual method.
**IncludeGraph** when set all entites that have relations with main ones from the list are also merged into theirs tables.
**OmitClauseExistsExcept** removes the clause from Merge statement, useful when need to active triggers even for same data.
**DoNotUpdateIfTimeStampChanged** if set checks TimeStamp for Concurrency, ones with conflict will [not be updated](https://github.com/borisdj/EFCore.BulkExtensions/issues/469#issuecomment-803662721).
**SRID** Spatial Reference Identifier - for SQL Server with NetTopologySuite. **SqlBulkCopyOptions** is Enum with [[Flags]](https://stackoverflow.com/questions/8447/what-does-the-flags-enum-attribute-mean-in-c) attribute which enables specifying one or more options:
*Default, KeepIdentity, CheckConstraints, TableLock, KeepNulls, FireTriggers, UseInternalTransaction* Last optional argument is **Action progress** (Example in *EfOperationTest.cs* *RunInsert()* with *WriteProgress()*). ```C# context.BulkInsert(entitiesList, null, (a) => WriteProgress(a)); ``` Library supports [Global Query Filters](https://docs.microsoft.com/en-us/ef/core/querying/filters) and [Value Conversions](https://docs.microsoft.com/en-us/ef/core/modeling/value-conversions) as well. BatchUpdate and named Property with [EnumToString Conversion](https://github.com/borisdj/EFCore.BulkExtensions/issues/397).
It also maps [OwnedTypes](https://docs.microsoft.com/en-us/ef/core/modeling/owned-entities), which is implemented with `DataTable` class.
With [Computed](https://docs.microsoft.com/en-us/ef/core/modeling/relational/computed-columns) and [Timestamp](https://docs.microsoft.com/en-us/ef/core/modeling/concurrency) Columns it will work in a way that they are automatically excluded from Insert. And when combined with *SetOutputIdentity* they will be Selected.
[Spatial](https://docs.microsoft.com/en-us/sql/relational-databases/spatial/spatial-data-types-overview?view=sql-server-ver15) types, like Geometry, also supported and if Entity has one, clause *EXIST ... EXCEPT* it skipped because it's not comparable.
Performance for bulk ops measured with `ActivitySources` named: '*EFCore.BulkExtensions.BulkExecute*' (tags: '*operationType*', '*entitiesCount*')
Bulk Extension methods can be [Overridden](https://github.com/borisdj/EFCore.BulkExtensions/issues/56) if required, for example to set AuditInfo.
If having problems with Deadlock there is useful info in [issue/46](https://github.com/borisdj/EFCore.BulkExtensions/issues/46). ## TPH inheritance When having TPH ([Table-Per-Hierarchy](https://docs.microsoft.com/en-us/aspnet/core/data/ef-mvc/inheritance)) inheritance model it can be set in 2 ways.
First is automatically by Convention in which case Discriminator column is not directly in Entity but is [Shadow](http://www.learnentityframeworkcore.com/model/shadow-properties) Property.
And second is to explicitly define Discriminator property in Entity and configure it with `.HasDiscriminator()`.
Important remark regarding the first case is that since we can not set directly Discriminator to certain value we need first to add list of entities to DbSet where it will be set and after that we can call Bulk operation. Note that SaveChanges are not called and we could optionally turn off TrackingChanges for performance. Example: ```C# public class Student : Person { ... } context.Students.AddRange(entities); // adding to Context so that Shadow property 'Discriminator' gets set context.BulkInsert(entities); ``` ## Read example When we need to Select from big List of some Unique Prop./Column use BulkRead (JOIN done in Sql) for Efficiency:
```C# // instead of WhereIN which will TimeOut for List with several thousand records var entities = context.Items.Where(a => itemsNames.Contains(a.Name)).AsNoTracking().ToList(); //SQL IN operator // or JOIN in Memory that loads entire table var entities = context.Items.Join(itemsNames, a => a.Name, p => p, (a, p) => a).AsNoTracking().ToList(); // USE var items = itemsNames.Select(a => new Item { Name = a }); // creating list of Items where only Name is set var bulkConfig = new BulkConfig { UpdateByProperties = new List { nameof(Item.Name) } }; context.BulkRead(items, bulkConfig); // Items list will be loaded from Db with data(other properties) ``` ## Performances Following are performances (in seconds for SQL Server): | Operations\Rows | 100,000 EF | 100,000 EFBulk | 1,000,000 EFBulk | | --------------- | ---------: | -------------: | ---------------: | | Insert | 38.98 s | 2.10 s | 17.99 s | | Update | 109.25 s | 3.96 s | 31.45 s | | Delete | 7.26 s | 2.04 s | 12.18 s | |-----------------|------------|----------------|------------------| | **Together** | 70.70 s | 5.88 s | 56.84 s | TestTable has 6 columns (Guid, string, string, int, decimal?, DateTime).
All were inserted and 2 of them (string, DateTime) were updated.
Test was done locally on following configuration: INTEL Core i5-3570K 3.40GHz, DDRIII 8GB x 2, SSD 840 EVO 128 GB.
For small data sets there is an overhead since most Bulk ops need to create Temp table and also Drop it after finish. Probably good advice would be to use Bulk ops for sets greater than 1000.