Editor's recommendation:
This article comes from CSDN. It summarizes Elasticsearch basics with examples, its architecture, implementation details, and some supplementary notes.
I. Basics
Elasticsearch is document oriented, which means it can store entire objects or documents. It does not merely store them, though: it also indexes the content of each document so that it can be searched. In Elasticsearch you index, search, sort, and filter documents rather than rows and columns of data. This is a fundamentally different way of thinking about data, and it is one of the reasons Elasticsearch can perform complex full-text search.
1. Example
Let's look at a practical example. Suppose we have the following data:

Each row here is a document, and each document has a docid. The inverted index built for these documents is:

As you can see, the inverted index is per field: each field has its own inverted index. Values such as 18 and 20 are called terms, and [1,3] is a posting list. A posting list is an array of ints that stores the ids of all documents matching a given term.
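The figures for this example are not available here, so the following is only a minimal sketch of a per-field inverted index. The three rows and the field names `age` and `gender` are assumptions, chosen to agree with the terms 18 and 20 and the posting list [1,3] mentioned above.

```python
from collections import defaultdict

# Hypothetical rows (docid -> fields); not the data from the original figure.
docs = {
    1: {"age": 18, "gender": "female"},
    2: {"age": 20, "gender": "female"},
    3: {"age": 18, "gender": "male"},
}

def build_inverted_index(docs):
    """Return {field: {term: posting list of doc ids in ascending order}}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id in sorted(docs):
        for field, term in docs[doc_id].items():
            index[field][term].append(doc_id)
    return index

index = build_inverted_index(docs)
print(index["age"][18])           # [1, 3]
print(index["gender"]["female"])  # [1, 2]
```

Each field gets its own term-to-posting-list map, which is what "per field" means in the paragraph above.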
2. Concepts
You can think of es as a document-oriented database. Its terminology maps onto that of a relational database as follows:
Relational DB => Databases => Tables => Rows => Columns
Elasticsearch => Index => doc Types => Documents => Fields
2.1 Index
An index is Elasticsearch's logical storage for logical data, so it can be divided into smaller parts. You can think of an index as a table in a relational database. Elasticsearch can keep an index on one machine or spread it across several servers. Each index has one or more shards, and each shard can have multiple replicas.
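2.2 doc Types
In Elasticsearch, a single index can store objects that serve many different purposes. For example, a blog application can store both articles and comments. Document types let us easily distinguish the different objects within one index. Each document can have a different structure, but in real deployments, separating documents by type is a great help when operating on the data.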
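2.3 Document
The main entity stored in Elasticsearch is called a document. By analogy with a relational database, a document corresponds to one row in a table. From the client's point of view, a document is a JSON object. Each document is stored in an index and has a document type and a unique identifier, which Elasticsearch generates automatically when the client does not supply one.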
2.4 Field
A document consists of multiple fields. From the client's point of view, these are the key-value pairs in the JSON object.
II. Architecture
1. Shard
In fact, an index is just a namespace that points to one or more physical shards. The physical layout, from coarse to fine granularity, is as follows:

An Elasticsearch index is comparable to a table in MySQL, and the data of different indexes is physically isolated. An Elasticsearch index is split into multiple shards for storage, some of which are replicas. A shard is one unit of local storage (a directory on a local disk), that is, a Lucene index. Different shards may be allocated to different host nodes. A Lucene index stores many docs, so for manageability Lucene splits it further into segments (sets of files within that directory). The number of docs in a segment is capped at 2 to the 31st power, so a doc id fits in a single int. Each segment corresponds to a series of files that hold the index (inverted lists and so on) and the primary storage (DocValues and so on), and these files are in turn divided into small blocks for compression.
A shard is actually a Lucene instance with full search capability within its own scope, that is, over the data it owns. All indexing (verb) and storage of our documents happen on shards, but this is transparent: we do not talk to shards directly, we talk to the index (noun) that we created.
Shards are the key to how ES distributes data across your cluster. Think of shards as containers for data: documents are stored in shards, and shards are allocated to the nodes of the cluster. As the cluster grows or shrinks, ES automatically migrates shards between nodes to keep the cluster balanced.
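2. Replicas
Shards are either primary shards or replica shards. Every document in an index belongs to exactly one primary shard, so the number of primary shards determines the maximum amount of data an index can hold.
Note: a shard belongs to an index, not to the cluster.
A replica shard is a copy of a primary shard. Replicas serve two purposes: 1. redundancy for disaster recovery; 2. serving read requests, such as searching or retrieving documents.
The number of primary shards is fixed when the index is created and cannot be changed afterwards. The number of replicas can be changed at any time.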
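3. Document routing across shards
When you index a document, it is stored on exactly one primary shard. How does ES know which shard a document belongs to? When you create a new document, how does ES know whether to store it on shard1 or shard2? The process cannot be random, because we need to retrieve the document later. The routing algorithm is:
shard = hash(routing) % number_of_primary_shards
The routing value can be the document's id or a value set by the user. The hash function computes a number from routing, and that number is taken modulo the number of primary shards. This is why number_of_primary_shards cannot be changed once the index has been created.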
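To see why the shard count has to stay fixed, here is a small sketch of the formula above. `zlib.crc32` is only a stand-in hash chosen for illustration, and the document ids are made up; the point is that changing the divisor sends most documents to a different shard, so they could no longer be found.

```python
import zlib

def route(routing: str, number_of_primary_shards: int) -> int:
    """shard = hash(routing) % number_of_primary_shards (stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards

# Hypothetical document ids used as routing values.
doc_ids = [f"doc-{i}" for i in range(1000)]

# Count documents whose shard would change if the index went from 5 to 6 primaries.
moved = sum(1 for d in doc_ids if route(d, 5) != route(d, 6))
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```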
We can send requests to any node in the cluster, and every node is able to handle them. Every node knows where each document lives, so it can forward the request directly to the right place. In the examples below, we send all requests to NODE1.
4. Write operations
Creating, indexing, and deleting documents are all write operations. A write must complete successfully on the primary shard before it can be copied to the corresponding replicas.

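1. The client sends a write request to NODE1.
2. NODE1 uses the document's _id to determine that the document belongs to shard 0, then forwards the request to NODE3, where P0 is located.
3. NODE3 executes the request on P0. If it succeeds, NODE3 forwards the request in parallel to R0 on NODE1 and NODE2. Once all replicas report success, NODE3 reports success to the requesting node (NODE1), and NODE1 reports back to the client.
By the time the client receives the success response, the operation has succeeded on the primary shard and on all replica shards.
Some request parameters can change this behavior; see the original text.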
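The write steps above can be sketched as a toy model. The `Shard` class and `write` function are hypothetical names, not Elasticsearch APIs, and replication is done sequentially here for brevity where the text describes it as parallel.

```python
class Shard:
    """Toy in-memory shard copy; stores documents in a dict."""
    def __init__(self, name):
        self.name = name
        self.docs = {}

    def index(self, doc_id, source):
        self.docs[doc_id] = source
        return True

def write(primary, replicas, doc_id, source):
    """Apply the write on the primary first, then on every replica."""
    if not primary.index(doc_id, source):
        return False  # never forwarded to replicas if the primary fails
    # Success is reported only after all replicas have acknowledged.
    return all(replica.index(doc_id, source) for replica in replicas)

p0 = Shard("P0 on NODE3")
r0_copies = [Shard("R0 on NODE1"), Shard("R0 on NODE2")]
ok = write(p0, r0_copies, "1", {"title": "hello"})
print(ok)  # True
```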
5. Read operations
A document can be read from the primary shard or from any of its replica shards.

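Read steps:
1. The client sends a Get request to NODE1.
2. NODE1 uses the document's _id to determine that the document belongs to shard 0. Copies of shard 0 exist on all 3 nodes. This time, it forwards the request to NODE2.
3. NODE2 returns the document to NODE1, and NODE1 returns it to the client.
For read requests, the requesting node (NODE1) picks a different shard copy each time a request arrives in order to balance load, cycling through all replica shards in round-robin fashion.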
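A minimal sketch of the round-robin selection described above, assuming the three node names from the example; `pick_copy` is a hypothetical helper, not an Elasticsearch function.

```python
from itertools import cycle

# Nodes that hold a copy of shard 0 in the example above.
copies = cycle(["NODE1", "NODE2", "NODE3"])

def pick_copy():
    """Return the next shard copy in round-robin order."""
    return next(copies)

chosen = [pick_copy() for _ in range(6)]
print(chosen)  # ['NODE1', 'NODE2', 'NODE3', 'NODE1', 'NODE2', 'NODE3']
```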
6. Update operations
An update combines the two operations above: read and write.

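Steps:
1. The client sends an update request to NODE1.
2. NODE1 forwards the request to NODE3, where the primary shard is located.
3. NODE3 reads the document from P0, changes the JSON in the _source field, and then tries to reindex the modified document on P0. If the document has been modified by another process in the meantime, it repeats step 3. If this happens more than retry_on_conflict times, it gives up.
4. If NODE3 updates the document successfully, it forwards the new version of the document in parallel to the replica shards on NODE1 and NODE2, where it is reindexed. Once all replica shards report success, NODE3 returns success to the requesting node (NODE1), which then returns success to the client.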
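Step 3 is a read-modify-write loop with a conflict check. The sketch below models it with a version number per document; `PrimaryShard`, `VersionConflict`, and `update` are hypothetical names for illustration, and only the `retry_on_conflict` parameter name comes from the text.

```python
class VersionConflict(Exception):
    pass

class PrimaryShard:
    """Toy primary shard: doc_id -> (version, source)."""
    def __init__(self):
        self.docs = {}

    def get(self, doc_id):
        return self.docs[doc_id]

    def index(self, doc_id, source, expected_version):
        current_version, _ = self.docs[doc_id]
        if current_version != expected_version:
            raise VersionConflict(doc_id)  # someone else changed the document
        self.docs[doc_id] = (current_version + 1, source)

def update(shard, doc_id, change, retry_on_conflict=3):
    """Read, modify, reindex; retry on conflict, then give up."""
    for _ in range(retry_on_conflict + 1):
        version, source = shard.get(doc_id)
        new_source = {**source, **change}
        try:
            shard.index(doc_id, new_source, expected_version=version)
            return new_source
        except VersionConflict:
            continue
    raise VersionConflict(f"gave up updating {doc_id}")

p0 = PrimaryShard()
p0.docs["1"] = (1, {"views": 0})
result = update(p0, "1", {"views": 1})
print(result, p0.get("1"))
```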
7. Query operations

III. Implementation details
1. Term dictionary and term index
What data structure is used to make term lookup fast?

Suppose we have many terms, for example:
Carla,Sara,Elin,Ada,Patty,Kate,Selena
In this order, finding a particular term is bound to be slow: the terms are unsorted, so every one of them has to be checked. After sorting, they become:
Ada,Carla,Elin,Kate,Patty,Sara,Selena
Now we can use binary search to find the target term faster than a full scan. This is the term dictionary. With a term dictionary, the target can be found in log N disk seeks. But random disk reads are still very expensive (one random access takes roughly 10 ms). To read the disk as little as possible, some data has to be cached in memory. The whole term dictionary, however, is too large to fit in memory. Hence the term index. The term index is a bit like the chapter table at the front of a dictionary. For example:
Terms starting with A ………. page Xxx
Terms starting with C ………. page Xxx
Terms starting with E ………. page Xxx
If all terms consisted of English letters, the term index might really be just a table of the 26 letters. In reality, terms are not necessarily English letters; a term can be an arbitrary byte array. Nor are terms evenly distributed over the 26 letters: there may be no terms at all starting with x, while terms starting with s are especially numerous. The actual term index is a trie:

The example is a trie containing "A", "to", "tea", "ted", "ten", "i", "in", and "inn". The tree does not contain every term; it contains some prefixes of terms. Through the term index you can quickly locate an offset in the term dictionary and then scan sequentially from that position. With some compression techniques added (search for Lucene Finite State Transducers), the term index can be as small as one part in several tens of the total size of all terms, which makes it feasible to cache the entire term index in memory. Overall, the effect looks like this.
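As a rough sketch of the two-level lookup, the code below uses the sorted names from earlier as the term dictionary and a first-letter table as the term index. That table is a deliberate simplification of the trie/FST (an assumption made for brevity), but the flow is the same: jump to an offset, then scan sequentially.

```python
# Sorted term dictionary (stands in for the on-disk structure).
term_dictionary = ["Ada", "Carla", "Elin", "Kate", "Patty", "Sara", "Selena"]

# Simplified in-memory term index: first letter -> offset of the first such term.
term_index = {}
for offset, term in enumerate(term_dictionary):
    term_index.setdefault(term[0], offset)

def lookup(term):
    """Return the offset of term in the dictionary, or -1 if it is absent."""
    start = term_index.get(term[0])
    if start is None:
        return -1  # no term shares this prefix
    for offset in range(start, len(term_dictionary)):
        candidate = term_dictionary[offset]
        if candidate == term:
            return offset
        if candidate > term:
            break  # sorted order: the term cannot appear later
    return -1

print(lookup("Selena"))  # 6
print(lookup("Sam"))     # -1
```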

Now we can answer the question "why can Elasticsearch/Lucene retrieval be faster than MySQL?" MySQL has only the term dictionary layer, stored on disk in sorted b-tree form. Looking up a term requires several random-access disk operations. Lucene adds a term index on top of the term dictionary to speed up lookup, and the term index is cached in memory as a tree. After the term index yields the position of the matching block in the term dictionary, the term is looked up on disk, which greatly reduces the number of random disk accesses.
Two further points are worth mentioning. The term index is held in memory as an FST (finite state transducer), which is very memory efficient. The term dictionary is stored on disk in blocks, and each block is compressed using common prefixes: for example, if all words in a block start with Ab, the Ab can be omitted. This lets the term dictionary use less disk space than a b-tree.
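The common-prefix idea can be illustrated in a few lines. This is only a sketch with made-up words starting with Ab; it is not Lucene's actual block format.

```python
import os

def compress_block(terms):
    """Store the shared prefix once and keep only each term's suffix."""
    prefix = os.path.commonprefix(terms)
    return prefix, [term[len(prefix):] for term in terms]

def decompress_block(prefix, suffixes):
    return [prefix + suffix for suffix in suffixes]

block = ["Abandon", "Ability", "Able", "About"]  # hypothetical block of terms
prefix, suffixes = compress_block(block)
print(prefix, suffixes)  # Ab ['andon', 'ility', 'le', 'out']
```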
2. How are indexes combined in a query?
Given the filter age=18, the process is to use the term index to find the approximate position of 18 in the term dictionary, then locate the term 18 exactly in the term dictionary, and obtain either a posting list or a pointer to where the posting list is stored. Querying gender=female works the same way. Finally, age=18 AND gender=female is computed by merging the two posting lists with an AND.
This AND merge is simple in theory but not in practice. In MySQL, if you build indexes on both age and gender, a query typically uses only the most selective one; the other condition is evaluated in memory while iterating over the rows, and non-matching rows are filtered out. So how can two indexes be used together? There are two approaches:
Use a skip list: traverse the gender and age posting lists at the same time, letting each skip ahead based on the other;
Use a bitset: compute a bitset for each of the gender and age filters, then AND the two bitsets.
PostgreSQL has supported combining two indexes through bitmaps since version 8.4, and it does so using the bitset data structure. Some commercial relational databases support similar index combination. Elasticsearch supports both approaches. If the query's filter is cached in memory (as a bitset), the merge is an AND of two bitsets. If the filter is not cached, the two on-disk posting lists are traversed using skip lists.
1. Merging with a skip list

Above are three posting lists. We need to merge them with AND to obtain their intersection. First pick the shortest posting list, then traverse it from smallest to largest. Some elements can be skipped along the way: for example, when we reach the green 13, we can skip the blue 3, because 3 is smaller than 13.
The whole process is as follows:
Next -> 2
Advance(2) -> 13
Advance(13) -> 13
Already on 13
Advance(13) -> 13 MATCH!!!
Next -> 17
Advance(17) -> 22
Advance(22) -> 98
Advance(98) -> 98
Advance(98) -> 98 MATCH!!!
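The resulting intersection is [13,98], and it is obtained much faster than by fully traversing all three posting lists. The prerequisite is that each list supports an Advance operation that quickly moves its cursor forward. What kind of list can leapfrog forward like this? A skip list.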
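The intersection procedure can be sketched as follows. The three posting lists are hypothetical (the original figure is not available), chosen so that the result matches the [13,98] above, and `advance` is implemented with a binary search as a stand-in for a real skip list.

```python
from bisect import bisect_left

def advance(postings, pos, target):
    """Index of the first entry >= target, searching from position pos."""
    return bisect_left(postings, target, pos)

def intersect(lists):
    """AND several sorted posting lists by leapfrogging toward a common id."""
    lists = sorted(lists, key=len)  # start from the shortest list
    if not all(lists):
        return []
    cursors = [0] * len(lists)
    result = []
    target = lists[0][0]
    while True:
        matched = True
        for i, postings in enumerate(lists):
            cursors[i] = advance(postings, cursors[i], target)
            if cursors[i] == len(postings):
                return result  # one list is exhausted, nothing more can match
            if postings[cursors[i]] != target:
                target = postings[cursors[i]]  # leap to the larger candidate
                matched = False
                break
        if matched:
            result.append(target)
            target += 1

# Hypothetical posting lists.
a = [2, 13, 17, 20, 98]
b = [1, 13, 22, 35, 98, 99]
c = [1, 3, 13, 20, 35, 80, 98]
print(intersect([a, b, c]))  # [13, 98]
```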

Conceptually, take a very long posting list such as:
[1,3,13,101,105,108,255,256,257]
We can split this list into three blocks:
[1,3,13] [101,105,108] [255,256,257]
Then we can build the second level of the skip list:
[1,101,255]
The entries 1, 101, and 255 each point to their own block. This makes it possible to move the cursor across blocks very quickly.
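Using the blocks and second level from the text, a two-level `advance` might look like this. It is a conceptual sketch only; the function name and the in-memory lists are assumptions, not Lucene's implementation.

```python
from bisect import bisect_right

blocks = [[1, 3, 13], [101, 105, 108], [255, 256, 257]]
skip_level = [block[0] for block in blocks]  # [1, 101, 255]

def advance(target):
    """Return the first doc id >= target, or None if there is none."""
    # Use the second level to jump straight to the last block whose
    # first entry is <= target, skipping all earlier blocks.
    start = max(bisect_right(skip_level, target) - 1, 0)
    for block in blocks[start:]:
        for doc_id in block:
            if doc_id >= target:
                return doc_id
    return None

print(advance(106))  # 108
print(advance(109))  # 255
```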
Lucene naturally compresses these blocks further. The compression scheme is called Frame Of Reference encoding. An example:

Consider frequently occurring terms (so-called low cardinality values), such as male or female in the gender field. With 1 million documents, the posting list for gender male would contain 500,000 int values. Compressing it with Frame of Reference encoding greatly reduces disk usage, which matters a great deal for reducing index size. MySQL b-trees also contain something similar to a posting list, but it is not compressed in this way.
Frame of Reference encoding comes with a decompression cost. Using a skip list therefore saves not only traversal cost but also the work of decompressing the blocks that are skipped, which saves CPU.
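A toy version of Frame of Reference encoding: delta-encode the sorted posting list, split the deltas into blocks, and store each block with just enough bits for its largest delta. The sample list and the block size of 3 are illustrative assumptions, far smaller than anything used in practice.

```python
def frame_of_reference(postings, block_size=3):
    """Return [(bits_per_value, deltas)] for each block of a sorted posting list."""
    deltas = [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]
    blocks = [deltas[i:i + block_size] for i in range(0, len(deltas), block_size)]
    # Every value in a block is stored with the bit width of the block's largest delta.
    return [(max(block).bit_length(), block) for block in blocks]

def decode(encoded):
    """Rebuild the posting list by summing the deltas back up."""
    postings, current = [], 0
    for _, block in encoded:
        for delta in block:
            current += delta
            postings.append(current)
    return postings

encoded = frame_of_reference([73, 300, 302, 332, 343, 372])
print(encoded)  # [(8, [73, 227, 2]), (5, [30, 11, 29])]
total_bits = sum(bits * len(block) for bits, block in encoded)
print(total_bits)  # 39, versus 6 * 32 = 192 bits for plain ints
```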
2. Merging with a bitset
A bitset is a very intuitive data structure. For a posting list such as:
[1,3,4,7,10]
the corresponding bitset is:
[1,0,1,1,0,0,1,0,0,1]
Each document corresponds to one bit, ordered by document id. A bitset is inherently compact: one byte represents 8 documents, so 1 million documents need only 125,000 bytes. But since there may be billions of documents, keeping bitsets in memory is still a luxury. Moreover, every filter consumes its own bitset: caching age=18 takes one bitset, and 18<=age<25 is a different filter whose cache needs another bitset.
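The conversion and the AND merge can be sketched with a Python int used as the bitset. The first posting list is the one from the text; the second one and the variable names are made up for illustration.

```python
def to_bitset(postings):
    """Set bit (doc_id - 1) for every doc id; ids are 1-based as in the text."""
    bits = 0
    for doc_id in postings:
        bits |= 1 << (doc_id - 1)
    return bits

def to_postings(bits):
    """Convert a bitset back into a sorted list of doc ids."""
    return [i + 1 for i in range(bits.bit_length()) if (bits >> i) & 1]

age_18 = to_bitset([1, 3, 4, 7, 10])
female = to_bitset([1, 2, 4, 10])  # hypothetical second filter
print(to_postings(age_18 & female))  # [1, 4, 10]

# One bit per document: 1,000,000 documents fit in 125,000 bytes.
print(1_000_000 // 8)  # 125000
```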
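So the trick is to have a data structure that:
can store hundreds of millions of bits very compactly, each indicating whether the corresponding document matches the filter;
still supports fast AND and OR operations on the compressed bitset.
The data structure Lucene uses for this is called Roaring Bitmap.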

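The idea behind the compression is quite simple. Rather than storing 100 zeros in 100 bits, store the 0 once and declare that it repeats 100 times.
Both ways of combining indexes have their uses. Elasticsearch has published a detailed performance comparison (https://www.elastic.co/blog/frame-of-reference-and-roaring-bitmaps). The short conclusion: because Frame of Reference encoding is so efficient, for simple equality filters, caching the result as a pure in-memory bitset is actually slower than the skip list approach, even though the latter has to access the disk.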
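The repeated-zeros picture above is a simplification. As the Elastic blog post linked above describes it, a roaring bitmap groups doc ids by their upper 16 bits and stores each group either as a sorted array of 16-bit values (when it holds at most 4096 ids) or as a fixed-size bitmap. The sketch below follows that description; the function name and sample ids are assumptions, and it is not Lucene's code.

```python
from array import array

def roaring_containers(postings):
    """Group doc ids by their high 16 bits and choose a container per group."""
    chunks = {}
    for doc_id in postings:
        chunks.setdefault(doc_id >> 16, []).append(doc_id & 0xFFFF)
    containers = {}
    for high, lows in chunks.items():
        if len(lows) > 4096:
            bitmap = bytearray(65536 // 8)  # dense group: fixed 8 KiB bitmap
            for low in lows:
                bitmap[low >> 3] |= 1 << (low & 7)
            containers[high] = ("bitmap", bitmap)
        else:
            containers[high] = ("array", array("H", lows))  # sparse: 16-bit values
    return containers

sparse = [5, 70000, 70001]                      # hypothetical sparse ids
dense = list(range(131072, 131072 + 5000))      # hypothetical dense run
containers = roaring_containers(sparse + dense)
print({high: kind for high, (kind, _) in containers.items()})
```

The 4096 threshold is where the two encodings cost the same: 4096 ids at 2 bytes each equal the 8192 bytes of a 65536-bit bitmap.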
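3. How to reduce the number of documents?
A common way to compress time series storage is to merge multiple data points into one row. One of OpenTSDB's key tricks for handling massive amounts of data is to periodically merge many rows into one, a process called compaction. Similarly, when VividCortex stores data in MySQL, it merges the many data points of one minute into a single MySQL row to reduce the row count.
The process can be illustrated as follows: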

As you can see, rows have become columns. Each column can represent the data for one second within that minute.
Elasticsearch has a feature that achieves a similar optimization: Nested Document. We can pack many data points from a period of time into one parent document, where they become its nested child documents. For example:
{timestamp:12:05:01, idc:sz, value1:10,value2:11}
{timestamp:12:05:02, idc:sz, value1:9,value2:9}
{timestamp:12:05:02, idc:sz, value1:18,value2:17}
can be packed into:
{
max_timestamp:12:05:02, min_timestamp:12:05:01, idc:sz,
records: [
{timestamp:12:05:01, value1:10,value2:11},
{timestamp:12:05:02, value1:9,value2:9},
{timestamp:12:05:02, value1:18,value2:17}
]
}
This moves the dimension fields shared by the data points up into the parent document, so they no longer have to be repeated in every child document, which reduces the size of the index.
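The packing step can be sketched in client-side code as below. The `pack` helper is hypothetical and only builds the parent JSON shape shown above; it does not define the nested mapping or call Elasticsearch.

```python
from itertools import groupby

points = [
    {"timestamp": "12:05:01", "idc": "sz", "value1": 10, "value2": 11},
    {"timestamp": "12:05:02", "idc": "sz", "value1": 9, "value2": 9},
    {"timestamp": "12:05:02", "idc": "sz", "value1": 18, "value2": 17},
]

def pack(points, dimension="idc"):
    """Group points by the shared dimension and hoist it into a parent document."""
    ordered = sorted(points, key=lambda p: (p[dimension], p["timestamp"]))
    parents = []
    for dim_value, group in groupby(ordered, key=lambda p: p[dimension]):
        # Children keep everything except the shared dimension field.
        records = [{k: v for k, v in p.items() if k != dimension} for p in group]
        parents.append({
            "min_timestamp": records[0]["timestamp"],
            "max_timestamp": records[-1]["timestamp"],
            dimension: dim_value,
            "records": records,
        })
    return parents

packed = pack(points)
print(packed[0]["idc"], len(packed[0]["records"]))  # sz 3
```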

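(Image source: https://www.youtube.com/watch?v=Su5SHc_uJw8, Faceting with Lucene Block Join Query)
When stored, parent and child documents alike are just documents to Lucene, and each has a document id. For nested documents, however, it is guaranteed that the child documents and their parent have consecutive document ids, with the parent always last. With this ordering guaranteed, a single posting list of all parent documents is enough to track every parent-child relationship, and converting between parent and child document ids is easy. If the parent-child relationship is treated as just another filter, then a query merely ANDs one more filter. As shown earlier, Elasticsearch handles multiple filters very efficiently and makes full use of the underlying indexes.
With nested documents, the posting list for a term only needs to store the doc ids of parent documents, which is far fewer than the doc ids of all data points. If we can pack 50 nested documents into one parent document, the posting list can shrink to 1/50 of its former size.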
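Because children are stored immediately before their parent, the posting list of parent ids alone is enough to convert between child and parent ids. The sketch below assumes hypothetical doc ids (children 0 to 2 with parent 3, children 4 and 5 with parent 6); the function names are illustrative.

```python
from bisect import bisect_left

# Posting list of all parent documents (hypothetical ids).
parent_ids = [3, 6]

def parent_of(doc_id):
    """The parent is the first parent id at or after the given doc id."""
    return parent_ids[bisect_left(parent_ids, doc_id)]

def children_of(parent_id):
    """Children are the ids between the previous parent and this one."""
    i = bisect_left(parent_ids, parent_id)
    first = parent_ids[i - 1] + 1 if i > 0 else 0
    return list(range(first, parent_id))

print(parent_of(1), parent_of(4))  # 3 6
print(children_of(6))              # [4, 5]
```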
IV. Supplementary notes
