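Last time I wrote a post, "Lucene.net - the numeric range search problem and a temporary workaround", and afterwards..
a more experienced developer, sholfen, gave me a keyword hint: NumericField
I looked up the docs online, and sure enough this is exactly what I needed.. no more little tricks to work around the numeric problem.. OH~Ya..
Thanks, sholfen~ Turns out writing a blog is a way to learn things too..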
Data overview
1~1200: the Id and Age fields are likewise 1~1200
11001~12000: the Id and Age fields are likewise 11001~12000
The structure is
{
"Id":"9",
"Memo":"當麻左手凌空劈出,右掌跟著迅捷之極的劈出,左手掌力先發後到,右手掌力後發先到,兩股力道交錯而前,詭異之極",
"Birthday":"1900-01-10T00:00:00",
"Age":9
}
Creating a numeric index field
Previously the field was indexed like this
Field f_Age = new Field("Age", ds["Age"].ToString(), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO);
This time we build it with NumericField instead..
NumericField f_Age = new NumericField("Age", Field.Store.YES, true);
f_Age.SetIntValue(int.Parse(ds["Age"].ToString()));
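Why the plain string Field gets in the way of range searches: terms indexed as text are compared character by character, not by numeric value. The following is only a minimal sketch in plain C# (no Lucene involved) that imitates this with ordinal string comparison; the class and helper names are made up for illustration, and the sample ages are arbitrary.

```csharp
using System;
using System.Linq;

public static class StringVsNumericOrder
{
    // True when value lies in [min, max] under ordinal (character-by-character) comparison.
    // Hypothetical helper: it only imitates how text terms are ordered.
    public static bool InStringRange(string value, string min, string max)
    {
        return string.CompareOrdinal(value, min) >= 0
            && string.CompareOrdinal(value, max) <= 0;
    }

    // True when value lies in [min, max] under ordinary integer comparison.
    public static bool InIntRange(int value, int min, int max)
    {
        return value >= min && value <= max;
    }

    public static void Main()
    {
        int[] ages = { 2, 9, 11, 15, 20, 100, 150, 1200 };

        var asStrings = ages.Where(a => InStringRange(a.ToString(), "11", "20"));
        var asInts = ages.Where(a => InIntRange(a, 11, 20));

        // Prints "String range: 2, 11, 15, 20, 150, 1200"
        Console.WriteLine("String range: " + string.Join(", ", asStrings));
        // Prints "Int range:    11, 15, 20"
        Console.WriteLine("Int range:    " + string.Join(", ", asInts));
    }
}
```

Under text ordering, "2", "150" and "1200" all fall between "11" and "20", which is the kind of surprise that a numeric field avoids.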
C# Code:
// Namespaces used: System.Diagnostics, System.IO, System.Linq, Newtonsoft.Json.Linq,
// Lucene.Net.Analysis.Standard, Lucene.Net.Documents, Lucene.Net.Index, Lucene.Net.Store
Stopwatch sw = new Stopwatch();
// Read all the source data
var di = new DirectoryInfo(AppDomain.CurrentDomain.BaseDirectory + "\\Source\\");
sw.Start();
var allObjects = di.GetFiles().Select(
    x => JObject.Parse(File.ReadAllText(x.FullName))).ToArray();
// Path where the index is stored
string indexPath = AppDomain.CurrentDomain.BaseDirectory + "\\Index5\\";
FSDirectory dir = FSDirectory.Open(new DirectoryInfo(indexPath));
// IndexWriter
IndexWriter indexWriter = new IndexWriter(dir, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29), true, IndexWriter.MaxFieldLength.UNLIMITED);
// Rebuild each record and add the fields that need indexing
foreach (JObject ds in allObjects)
{
    Document doc = new Document();
    // Index every field
    Field f_Id = new Field("Id", ds["Id"].ToString(), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO);
    Field f_Memo = new Field("Memo", ds["Memo"].ToString(), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO);
    Field f_BirthDay = new Field("BirthDay", DateTime.Parse(ds["Birthday"].ToString()).ToString("yyyyMMdd"), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO);
    // Create the numeric index field
    NumericField f_Age = new NumericField("Age", Field.Store.YES, true);
    f_Age.SetIntValue(int.Parse(ds["Age"].ToString()));
    doc.Add(f_Id); doc.Add(f_Age); doc.Add(f_Memo); doc.Add(f_BirthDay);
    indexWriter.AddDocument(doc);
}
indexWriter.Optimize();
indexWriter.Commit();
indexWriter.Close();
sw.Stop();
Response.Write("Indexing " + allObjects.Length + " records took: " + sw.Elapsed);
Searching a numeric range
For the search, use NumericRangeQuery
// Search a range of Age values
NumericRangeQuery nquery = NumericRangeQuery.NewIntRange("Age", 11, 20, true, true);
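The last two arguments of NewIntRange say whether the lower and upper bounds themselves count as matches. As a rough plain-C# analogue (not Lucene's implementation; `Matches` and `CountMatches` are hypothetical helpers, and the ages 1..1200 simply mirror the first batch of sample data), the four combinations behave like this:

```csharp
using System;
using System.Linq;

public static class RangeBounds
{
    // Mirrors the meaning of the (minInclusive, maxInclusive) flags for a single value.
    public static bool Matches(int value, int min, int max, bool minInclusive, bool maxInclusive)
    {
        bool aboveMin = minInclusive ? value >= min : value > min;
        bool belowMax = maxInclusive ? value <= max : value < max;
        return aboveMin && belowMax;
    }

    // Counts how many ages in 1..1200 fall inside the range.
    public static int CountMatches(int min, int max, bool minInclusive, bool maxInclusive)
    {
        return Enumerable.Range(1, 1200)
            .Count(age => Matches(age, min, max, minInclusive, maxInclusive));
    }

    public static void Main()
    {
        Console.WriteLine(CountMatches(11, 20, true, true));   // 10 (11..20)
        Console.WriteLine(CountMatches(11, 20, false, true));  // 9  (12..20)
        Console.WriteLine(CountMatches(11, 20, true, false));  // 9  (11..19)
        Console.WriteLine(CountMatches(11, 20, false, false)); // 8  (12..19)
    }
}
```

With both flags set to true, as in the query above, 11 and 20 are both part of the result.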
C# Code :
// Start the stopwatch
Stopwatch sw = new Stopwatch();
sw.Start();
// Open the index
string indexPath = AppDomain.CurrentDomain.BaseDirectory + "\\Index5\\";
DirectoryInfo dirInfo = new DirectoryInfo(indexPath);
FSDirectory dir = FSDirectory.Open(dirInfo);
IndexSearcher search = new IndexSearcher(dir, true);
// QueryParser for the Age field (not used by the NumericRangeQuery below)
QueryParser parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "Age", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
// Search a range of Age values (both bounds inclusive)
NumericRangeQuery nquery = NumericRangeQuery.NewIntRange("Age", 11, 20, true, true);
// Sort by Id as an integer (SortField.INT is the constant 4)
Sort sort = new Sort(new SortField("Id", SortField.INT));
// Run the search
var hits = search.Search(nquery, null, search.MaxDoc(), sort).ScoreDocs;
sw.Stop();
Response.Write("Elapsed time: " + sw.Elapsed + "<br /><hr />");
Response.Write("Record count: " + hits.Length + "<br /><hr />");
Response.Write("Result:<br />");
foreach (var res in hits)
{
    // Load the stored document once instead of once per field
    Document doc = search.Doc(res.doc);
    Response.Write("Id:" + doc.Get("Id") + " BirthDay=" + doc.Get("BirthDay") + " Memo=" + doc.Get("Memo") + "<br />");
}
// Release the searcher when done
search.Close();
Result
OH~ YA~~
Finally, no more need for that conversion trick to get this done ..^^
References:
http://sholfen.pixnet.net/blog/post/42417709
http://stackoverflow.com/questions/7866376/lucene-searching-for-a-numeric-value-field