Lucene.Net – LINQ to Lucene using Entity Framework

April 26th, 2012 | Categories: Programming

What is Lucene?

Lucene is an information retrieval library used for full-text searching and for scoring results. It is commonly referred to as a search engine (similar to Google-style searching).

Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users. The Lucene search library is based on an inverted index.
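To make the idea of an inverted index concrete, here is a minimal C# sketch. It is not how Lucene implements its index, whose real format is far more elaborate. The names TinyInvertedIndex, Add and Search are hypothetical and exist only for this illustration. The sketch maps each term to the set of document ids that contain it, so a lookup goes from a word to its documents rather than scanning every document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only (not a Lucene.Net type): maps each term
// to the ids of the documents that contain it.
public class TinyInvertedIndex
{
    // Term -> ids of documents containing that term (case-insensitive terms).
    private readonly Dictionary<string, SortedSet<int>> _postings =
        new Dictionary<string, SortedSet<int>>(StringComparer.OrdinalIgnoreCase);

    // Split the text into terms and record the document id under each term.
    public void Add(int docId, string text)
    {
        var terms = text.Split(new[] { ' ', ',', '.', ';' },
                               StringSplitOptions.RemoveEmptyEntries);
        foreach (var term in terms)
        {
            SortedSet<int> ids;
            if (!_postings.TryGetValue(term, out ids))
            {
                ids = new SortedSet<int>();
                _postings[term] = ids;
            }
            ids.Add(docId);
        }
    }

    // Look up a single term; returns matching document ids in ascending order,
    // or an empty list when the term was never indexed.
    public IReadOnlyList<int> Search(string term)
    {
        SortedSet<int> ids;
        return _postings.TryGetValue(term, out ids)
            ? ids.ToList()
            : new List<int>();
    }
}
```

With document 1 added as "Lucene is a search library" and document 2 as "LINQ queries a search index", Search("search") returns both ids, while Search("lucene") returns only document 1.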

LINQ (Language Integrated Query) is Microsoft’s extensible query platform that provides querying directly within .NET languages running on the CLR. LINQ comes out of the box with LINQ to Objects, LINQ to DataSets, LINQ to XML, LINQ to SQL and LINQ to Entities. The goal of LINQ to Lucene is to let developers use full-text searching, backed by a fast and proven search engine, from within managed .NET code.

What is LINQ to Lucene?

LINQ to Lucene is a CodePlex project that provides a custom LINQ solution for the Lucene information retrieval system, commonly referred to as a search engine.

Because Lucene stores its data in an index as documents with fields, the project can map custom business object types to that index. It acts as a kind of Object-Index-Mapper (OIM), similar to the typical Object-Relational-Mapper (ORM) that LINQ to SQL provides. LINQ to Lucene uses attributes and reflection on the classes to be indexed and on their members, similar to the Table and Column decorators for SQL-mapped objects. A type can carry LINQ to SQL decorations on a large number of members while also carrying LINQ to Lucene decorations on the few members that are indexed. Classes stored in the Lucene index are decorated with the [Document] attribute, and their properties are decorated with the [Field] attribute. Let’s start with a practical example.

Create Search Solution

A search solution has two main parts: indexing the data you want to search, and actually searching the content.

First create a project and add the required references. We are going to use Lucene.Net.dll, Lucene.Linq.dll and EntityFramework.dll.

We will create our data context with the Entity Framework “Reverse Engineer Code First” feature, using Northwind as the model. Right-click your project, choose Entity Framework, then Reverse Engineer Code First. A dialog will appear where you connect to your database.

Entity Framework will then generate all the mapping and entity classes for you.

We now have NorthwindContext, which represents the DbContext for the Northwind database. We also need an index context for the indexed data, so we will create NorthwindIndexContext to manage it:

public partial class NorthwindIndexContext : EfDatabaseIndexSet<NorthwindContext>

Note that it must inherit from EfDatabaseIndexSet because we are using an Entity Framework DbContext. NorthwindIndexContext holds all the entities that are indexed and ready for search, just as NorthwindContext does for the database. Next we must specify which entities will be indexed; we will take the Customer entity as an example.

[Document]
public class Customer : IIndexable, INotifyPropertyChanging, INotifyPropertyChanged

As mentioned above, to mark an entity for indexing you decorate the entity class with the [Document] attribute, which tells Lucene that the entity is ready for indexing and searching. The class must also implement the IIndexable, INotifyPropertyChanging and INotifyPropertyChanged interfaces.
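The two change-notification interfaces come from System.ComponentModel in the .NET Framework, and a common way to implement them is sketched below. This is only an illustration under stated assumptions: NotifyingCustomer is a hypothetical stand-in class, and the [Document] attribute and IIndexable interface are left out because they come from Lucene.Linq and their members are not shown in this article.

```csharp
using System.ComponentModel;

// Hypothetical stand-in for the Customer entity: it shows only the two
// System.ComponentModel interfaces. [Document] and IIndexable are omitted
// on purpose because they belong to Lucene.Linq.
public class NotifyingCustomer : INotifyPropertyChanging, INotifyPropertyChanged
{
    public event PropertyChangingEventHandler PropertyChanging;
    public event PropertyChangedEventHandler PropertyChanged;

    private string _contactTitle;

    public string ContactTitle
    {
        get { return _contactTitle; }
        set
        {
            if (_contactTitle == value) return;  // unchanged value: raise nothing
            OnPropertyChanging("ContactTitle");  // raised before the value changes
            _contactTitle = value;
            OnPropertyChanged("ContactTitle");   // raised after the value changes
        }
    }

    // Raise PropertyChanging if anyone is subscribed.
    protected virtual void OnPropertyChanging(string propertyName)
    {
        var handler = PropertyChanging;
        if (handler != null) handler(this, new PropertyChangingEventArgs(propertyName));
    }

    // Raise PropertyChanged if anyone is subscribed.
    protected virtual void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
    }
}
```

A subscriber to PropertyChanged could, for example, flag the entity so that it is re-indexed after one of its indexed properties changes.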

The next step is to specify fields that you would like to index.

[Field(FieldIndex.Tokenized, FieldStore.Yes)]
public string CustomerID { get; set; }
[Field(FieldIndex.Tokenized, FieldStore.Yes, IsDefault = true)]
public string CompanyName { get; set; }
[Field(FieldIndex.Tokenized, FieldStore.Yes, Name = "ContactName")]
public string ContactName { get; set; }
[Field(FieldIndex.Tokenized, FieldStore.Yes)]
public string ContactTitle { get; set; }

The [Field] attribute tells Lucene that the property should be indexed and searchable. It accepts several more parameters, which I will leave for you to explore.

Now go back to NorthwindIndexContext and add a property for the Customer entity:

#region Properties

public IIndex<Customer> Customers
{
   get { return Get<Customer>(); }
}

#endregion

Now we are ready to write our program:

private static NorthwindIndexContext _index;

private static void Main(string[] args)
{
     const string path = @"C:\temp\index";

     // Start from a clean index directory on every run.
     if (Directory.Exists(path))
     {
          Directory.Delete(path, true);
     }
     Directory.CreateDirectory(path);
     Console.WriteLine("Northwind Db index: " + path);

     // The index context needs the index location and the EF context to read from.
     _index = new NorthwindIndexContext(path, new NorthwindContext());

     // add all the customers to the index
     _index.Write<Customer>();

     SimpleDemo();
     Console.WriteLine("Press Enter to continue...");
     Console.ReadLine();
}

private static void SimpleDemo()
{
    IQueryable<Customer> query = from c in _index.Customers
                                 where c.ContactTitle == "Owner"
                                 select c;

    Console.WriteLine("Simple Query:");
    ObjectDumper.Write(query);
    Console.WriteLine("Query Output: {0}", query);
    Console.WriteLine();

    List<Customer> results = query.ToList();
    Console.WriteLine("SimpleDemo returned {0} results", results.Count);
}

In the code above we specify the directory where Lucene will store the index, then instantiate NorthwindIndexContext with that path and a NorthwindContext. We can then index the Customer entity by calling

_index.Write<Customer>()

The SimpleDemo method is an example of how to query the indexed data. You will notice that it looks similar to LINQ and Entity Framework querying.

There are many more features and solutions in Lucene; you can explore them in the sample code.

Download sample

References
http://linqtolucene.codeplex.com/
http://lucene.apache.org/core/
http://incubator.apache.org/lucene.net/

9 Comments

  1. jonathan 29th September 2012 at 3:09 pm

    Hi – great article. Can you give a few pointers on how to implement the interfaces ( IIndexable , INotifyPropertyChanging and INotifyPropertyChanged)?

    Thanks!

    • Mohamed Farrag 1st October 2012 at 1:07 pm

      Hi Jonathan,
      The IIndexable interface only needs to be implemented to tell Lucene that the entity can be indexed, so that the engine can index it when called.
      The INotifyPropertyChanged and INotifyPropertyChanging interfaces are used to notify clients, typically binding clients, that a property value has changed or is about to change. They raise the PropertyChanged and PropertyChanging events when a particular property changes. You can use them, for example, to detect a property change and re-index the entity. You can read more about them at http://msdn.microsoft.com/en-us/library/system.componentmodel.inotifypropertychanged.aspx
      Thanks!

  2. Joseph 8th February 2013 at 12:21 pm

    Can I use it with database first entity framework?

    • Mohamed Farrag 9th February 2013 at 2:48 am

      Sure Joseph, the attached example is a database-first model built with the “Reverse Engineering” tool.
      Regards

      • Joseph 9th February 2013 at 2:16 pm

        But how it can be possible? I see, that you add attributes to entities. These attributes will be removed when database first entity framework will update entities. Sorry for bad english(I’m from Ukraine:).

        • Mohamed Farrag 10th February 2013 at 11:21 am

          I see what you mean. Yes, you do have to add your annotations again if you regenerate your database model.
          You are welcome anytime 🙂

          • Joseph 11th February 2013 at 10:54 pm

            That’s too bad:(

  3. AmyH 25th February 2013 at 6:47 pm

    Greetings,
    Looking at the sample code snippets, it looks like you are working on the Join and Nested methods. I’m attempting to join more than one entity (table) but am not successful. I attempted the usage of IQueryable, but seeing that one can only use one ‘T’, I’m not sure how to do the joins. I’m also a C# newbie, so any help is much appreciated! 🙂

  4. Joseph 11th June 2013 at 8:54 am

    I have a question again. Can I make part of query to DB context, by filtering by foreign table values or something else and then just search by Lucene Linq? A how does it looks like? Thanks!
