Our team has done a lot of work over the past few years to bring the Entity Framework up as an enterprise-ready ORM, but there is still a lot of work for us to do going forward, particularly in the area of object flexibility. Even though POCO entities allow some customization of collection types, there are many more scenarios that we don’t support out of the box, at least not without workarounds.
One of these scenarios is to use a collection of scalar values (like ints or strings) to represent a relationship, instead of a collection of very simple entity types, each of which has that scalar property. You’d want to do this when you need to persist the scalar values to the database but there isn’t any additional information associated with those values to justify a full-fledged entity type. The Entity Framework doesn’t support this today, but in this post I’ll take you through how you can simulate it with your entities.
In this post I’m using the Database First approach, but I’m sure you can achieve the same thing with Model First and Code First.
The Model
The domain for this post focuses on albums and songs, and in this simplified model, we are only interested in the name of the song for a particular album.
The conceptual model in EF does not look much different; the only difference from the default generation is that there is no navigation property from Song to Album—that is, there is no Album property on Song.
Everything seems straightforward so far. We’ve decided, though, that since the only property of interest on the Song class is SongTitle, the Album class should have a collection of song titles instead of a collection of Songs. I’ll show how this works in the next step.
The Model (Code)
Unfortunately, because we are hacking around the way the Entity Framework works, we have to make some compromises in the API we expose on the Album class. Ideally, I would like to write something like this:
[sourcecode language="csharp"]
public class Album
{
    public Album()
    {
        this.SongTitles = new HashSet<string>();
    }

    public int Id { get; set; }
    public string AlbumName { get; set; }
    public ICollection<string> SongTitles { get; private set; }
}
[/sourcecode]
Now for this to work with EF I have a couple of requirements:
- When I query for Albums, the SongTitles must be populated with titles from the related Song entities.
- When I add or remove SongTitles from an Album that is attached to the ObjectContext, this must be processed as an Add or Delete during the call to SaveChanges.
- Creating a new Album, populating the list of SongTitles, and then calling AddObject must also ensure that its SongTitles are added to the database when SaveChanges is called.
Let’s take this one step at a time.
Query
When you issue a query to the database, the tuples of data that come back are converted into objects. This process is called materialization. EF4 allows us to use the ObjectMaterialized event to discover when objects are materialized, and we can use this to load the song titles for an album when any Album instance is materialized. Before we get started, though, we need a Song class to query for and a derived ObjectContext class.
[sourcecode language="csharp"]
public class Song
{
    public int AlbumId { get; set; }
    public string SongTitle { get; set; }
}
[/sourcecode]
[sourcecode language="csharp"]
public class ScalarCollectionsContext : ObjectContext
{
    public ScalarCollectionsContext() :
        base("name=EFScalarCollectionEntities")
    {
    }

    public ObjectSet<Album> Albums
    {
        get { return this.CreateObjectSet<Album>(); }
    }

    public ObjectSet<Song> Songs
    {
        get { return this.CreateObjectSet<Song>(); }
    }
}
[/sourcecode]
Since the ObjectMaterialized event is exposed on the ObjectContext, we can subscribe to that event from within the ScalarCollectionsContext constructor. The implementation is fairly straightforward—once an Album is materialized, query for the Songs that are associated with that Album and add all of the song titles in the results to the Album’s SongTitles collection.
[sourcecode language="csharp"]
private void OnObjectMaterialized(object sender, ObjectMaterializedEventArgs e)
{
    var album = e.Entity as Album;
    if (album != null)
    {
        foreach (var songTitle in this.Songs.Where(s => s.AlbumId == album.Id).Select(s => s.SongTitle))
        {
            album.SongTitles.Add(songTitle);
        }
    }
}
[/sourcecode]
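The subscription itself is a single line in the constructor. Here is a minimal sketch, assuming the EF4 types in System.Data.Objects and the handler shown above (its body is elided here, as are the Albums and Songs properties):

[sourcecode language="csharp"]
using System.Data.Objects; // EF4 ObjectContext API

public class ScalarCollectionsContext : ObjectContext
{
    public ScalarCollectionsContext() :
        base("name=EFScalarCollectionEntities")
    {
        // Run the handler for every entity that a query materializes.
        this.ObjectMaterialized += this.OnObjectMaterialized;
    }

    // Albums and Songs properties as shown earlier.

    private void OnObjectMaterialized(object sender, ObjectMaterializedEventArgs e)
    {
        // Body as shown above.
    }
}
[/sourcecode]

Because the event fires for every materialized entity, not just Albums, the handler has to check the entity type before doing any work.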
One step done.
Adding/Removing SongTitles Processed during SaveChanges
The second step is to make sure that any time we add, remove, or change an item in the SongTitles collection, the corresponding add or delete occurs in the data store. Now you could do this by taking a snapshot of the collection when you query for it and then diffing it with the collection when you save changes, but to make things a little simpler we will instead leverage an ObservableCollection<string> for the SongTitles collection.
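For comparison, the snapshot-and-diff alternative boils down to two set differences. This is only a rough sketch with no EF dependency: SnapshotDiff, Added, and Removed are hypothetical names, and it assumes titles are unique within an album.

[sourcecode language="csharp"]
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares the titles captured at query time
// with the titles present when changes are about to be saved.
public static class SnapshotDiff
{
    // Titles present now but not in the snapshot: candidates for INSERT.
    public static List<string> Added(IEnumerable<string> snapshot, IEnumerable<string> current)
    {
        return current.Except(snapshot).ToList();
    }

    // Titles in the snapshot that are gone now: candidates for DELETE.
    public static List<string> Removed(IEnumerable<string> snapshot, IEnumerable<string> current)
    {
        return snapshot.Except(current).ToList();
    }
}
[/sourcecode]

The cost of this route is that something has to store the snapshot for every Album and run the diff before saving. The rest of this post takes the other route and uses an ObservableCollection<string> for SongTitles.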
Each Album instance can then subscribe to the CollectionChanged event of that collection and register the changes as adds or removes against a private navigation property for Songs. We use the navigation property because it bridges the gap between our objects and the change tracking capabilities built into the Entity Framework. If that’s unclear, I’ve included the new version of the class below. Note that SongTitles is now an ObservableCollection<string> whose changes update the private Songs collection.
[sourcecode language="csharp" padlinenumbers="true"]
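using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.Linq;

public class Album
{
    public Album()
    {
        this.Songs = new HashSet<Song>();
        this.SongTitles = new ObservableCollection<string>();
        this.SongTitles.CollectionChanged += OnSongTitlesChanged;
    }

    public int Id { get; set; }
    public string AlbumName { get; set; }
    public ObservableCollection<string> SongTitles { get; private set; }

    // Private navigation property that EF change-tracks on our behalf.
    private ICollection<Song> Songs { get; set; }

    private void OnSongTitlesChanged(object sender, NotifyCollectionChangedEventArgs e)
    {
        // Add and Replace: create a Song for each new title.
        if (e.NewItems != null)
        {
            foreach (string title in e.NewItems)
            {
                this.Songs.Add(new Song() { AlbumId = this.Id, SongTitle = title });
            }
        }

        // Remove and Replace: drop the Song that matches each old title.
        if (e.OldItems != null)
        {
            foreach (string title in e.OldItems)
            {
                // FirstOrDefault, not SingleOrDefault: SongTitles may hold a title twice.
                var song = this.Songs.FirstOrDefault(s => s.SongTitle == title);
                if (song != null)
                {
                    this.Songs.Remove(song);
                }
            }
        }

        // Clear() raises Reset with no item lists, so empty Songs as well.
        if (e.Action == NotifyCollectionChangedAction.Reset)
        {
            this.Songs.Clear();
        }
    }
}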
[/sourcecode]
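To see why the handler checks NewItems, OldItems, and the Reset action separately, it can help to watch what ObservableCollection<string> actually reports. The following is a small standalone sketch with no EF dependency; the CollectionChangedDemo class and the track names are made up for illustration.

[sourcecode language="csharp"]
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CollectionChangedDemo
{
    // Returns one line per raised event: the action plus the item counts.
    public static List<string> Run()
    {
        var log = new List<string>();
        var titles = new ObservableCollection<string>();

        titles.CollectionChanged += (sender, e) =>
            log.Add(string.Format("{0}: new={1}, old={2}",
                e.Action,
                e.NewItems == null ? "null" : e.NewItems.Count.ToString(),
                e.OldItems == null ? "null" : e.OldItems.Count.ToString()));

        titles.Add("Track A");      // raises Add
        titles[0] = "Track B";      // raises Replace
        titles.Remove("Track B");   // raises Remove
        titles.Add("Track C");      // raises Add
        titles.Clear();             // raises Reset

        return log;
    }
}
[/sourcecode]

Add carries only NewItems, Remove carries only OldItems, Replace carries both, and Reset carries neither, which is why Clear() needs its own branch in the handler.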
It would seem that we are done. When we query for Albums, their related SongTitles are populated, which also populates the Songs collection, because SongTitles is an observable collection whose change handler adds a matching Song for each title. Any changes to the SongTitles collection will update the corresponding Songs collection, and the Entity Framework will use that to work out which changes need to be made to the database. Finally, because we subscribe to the CollectionChanged event from within the Album’s constructor, if we create an Album outside of the context and then add or attach it, its corresponding children will be added or attached as well.
But, there are a couple of problems:
- Marking the Songs collection private will not work in medium trust.
- When the SongTitles are populated after querying for Albums, completely new Songs are added to the Songs collection, instead of the existing Songs that were materialized as part of the LINQ query. This means that when SaveChanges is called, the Entity Framework thinks it needs to INSERT all the existing songs as new songs, causing a primary key violation in the Songs table.
While the first problem is not something we can fix outside of the framework today, the second one is, and it requires a few changes to the code we have for the ObjectMaterialized event.
[sourcecode language="csharp"]
private void OnObjectMaterialized(object sender, ObjectMaterializedEventArgs e)
{
    var album = e.Entity as Album;
    if (album != null)
    {
        EntityCollection<Song> songs = (EntityCollection<Song>)this.ObjectStateManager
            .GetRelationshipManager(album)
            .GetRelatedEnd("EFScalarCollectionModel.FK_Song_Album", "Song");
        foreach (var song in songs.CreateSourceQuery())
        {
            album.SongTitles.Add(song.SongTitle);
        }
    }
}
[/sourcecode]
Welcome to the nasty side of the Entity Framework, where magic strings abound.
This code does a couple of very useful things to help us achieve our goal, so let’s break it down step-by-step:
First we retrieve the EF-centric view of the collection of Songs for the Album that was just materialized. This is an EntityCollection<Song>, a class you may recognize if you have used the Entity Framework with the default code generation in the past. The first magic string passed to the GetRelatedEnd method is the namespace-qualified Association name of the relationship between Song and Album; the second is the name of the Role that signifies which "end" of the relationship you want to retrieve. You can find both in the CSDL section of your EDMX file:
[sourcecode language="xml"]
<AssociationSet Name="FK_Song_Album" Association="EFScalarCollectionModel.FK_Song_Album">
  <End Role="Album" EntitySet="Albums" />
  <End Role="Song" EntitySet="Songs" />
</AssociationSet>
[/sourcecode]
Next we iterate over all the Songs in the collection after calling CreateSourceQuery; this allows us both to load the Songs collection in the Album instance and to populate the SongTitles collection in the same Album instance.
One last problem: we now have duplicate Songs in the collection, because SongTitles.Add triggers the CollectionChanged event and the handler adds a second Song for a title that was just loaded. We can fix this simply by ensuring that we don’t add duplicates in the OnSongTitlesChanged handler. Note the addition of the Where filter on the NewItems collection.
[sourcecode language="csharp"]
if (e.NewItems != null)
{
    foreach (string title in e.NewItems.Cast<string>().Where(t => !this.Songs.Any(s => s.SongTitle == t)))
    {
        this.Songs.Add(new Song() { AlbumId = this.Id, SongTitle = title });
    }
}
[/sourcecode]
And that’s it. This is one way to keep a collection of scalars in your POCO entities but project something entirely different to the Entity Framework. Theoretically, this would also work with collections of complex types, although I have not tried it.
Results
If you’re really concerned about a clear separation of concerns, this probably isn’t the greatest solution for you, since a lot of Entity Framework concerns bleed into the Album entity. Granted, the extra properties and methods are all private, but it’s still code you need to wade through every time you make changes to that part of the model.
Now, I have not tried this due to a lack of time, but there might be a way to remove the Songs navigation property entirely if you don’t care about the Song entities being in the state manager at all. If I wanted to do this, I would keep the following things in mind.
- As a consumer of the Album class, all I care about are the SongTitles, so I can make any changes I want to them, including adds, removes, and changes.
- When I call SaveChanges on the ObjectContext, I would need a way to replay changes made to the SongTitles collection against the ObjectContext in terms it understands, i.e. the Song entities.
- This means that it would be useful to have a collection that had an initial state and a log of all the changes made to it.
- The initial state is different for Albums created by users versus those created by the Entity Framework due to queries.
There are also some limitations with this approach; for example, Song cannot have any other scalar properties or associations.
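The "initial state plus a log of changes" idea from the list above could be sketched roughly as follows. This is a standalone illustration, not something from the attached solution: ChangeLoggingCollection, TitleChange, and TitleChangeKind are hypothetical names, and the replay against the ObjectContext is left out.

[sourcecode language="csharp"]
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public enum TitleChangeKind { Added, Removed }

public sealed class TitleChange
{
    public TitleChange(TitleChangeKind kind, string title) { Kind = kind; Title = title; }
    public TitleChangeKind Kind { get; private set; }
    public string Title { get; private set; }
}

// Hypothetical helper: remembers the titles it started with and records
// every add and remove so that they can be replayed later.
public class ChangeLoggingCollection : Collection<string>
{
    private readonly List<TitleChange> changes = new List<TitleChange>();
    private readonly List<string> initial;

    // initialTitles: what the database held when the Album was queried;
    // pass an empty sequence for an Album created by user code.
    public ChangeLoggingCollection(IEnumerable<string> initialTitles)
        : base(initialTitles.ToList())
    {
        this.initial = new List<string>(this.Items);
    }

    public IList<string> InitialTitles { get { return this.initial.AsReadOnly(); } }
    public IList<TitleChange> Changes { get { return this.changes.AsReadOnly(); } }

    protected override void InsertItem(int index, string item)
    {
        base.InsertItem(index, item);
        this.changes.Add(new TitleChange(TitleChangeKind.Added, item));
    }

    protected override void RemoveItem(int index)
    {
        string removed = this[index];
        base.RemoveItem(index);
        this.changes.Add(new TitleChange(TitleChangeKind.Removed, removed));
    }

    protected override void SetItem(int index, string item)
    {
        // A replacement is logged as a remove followed by an add.
        string replaced = this[index];
        base.SetItem(index, item);
        this.changes.Add(new TitleChange(TitleChangeKind.Removed, replaced));
        this.changes.Add(new TitleChange(TitleChangeKind.Added, item));
    }

    protected override void ClearItems()
    {
        // Log each title before the base class discards it.
        foreach (string title in this)
        {
            this.changes.Add(new TitleChange(TitleChangeKind.Removed, title));
        }
        base.ClearItems();
    }
}
[/sourcecode]

An Album materialized by a query would be given the titles read from the database as its initial state, while a brand new Album would start from an empty sequence. At SaveChanges time, a context could in principle walk Changes in order and translate each entry into an add or delete of a Song entity; that replay step is untested here.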
I have attached a solution with the final code in both C# and Visual Basic, along with a small number of acceptance tests that verify the requirements I laid out at the beginning of the post. If you decide to implement the alternative where there is no navigation property from Albums to Songs, then you can use these acceptance tests to help you get started.
If there are any other types of "hacks" you’d like to see with the Entity Framework, let me know in a comment!