Introduction to Big Data Algorithms


Performance Analysis of the Bloom Filter


Date: 30.04.2023

1.4 Performance Analysis of the Bloom Filter
Having understood insertion and lookup in a Bloom filter, consider: is a Bloom filter lookup reliable? Clearly not always, because hash collisions can occur here too: different elements, passed through the several hash functions, can map to the same bit positions. When that happens, a false positive can result: the element is not in the set, yet the filter reports it as present because of the collision.
Fortunately, because we can choose several different hash functions, the probability of such a collision can be kept as low as possible, and at least far lower than in a hash table. The relationship between the Bloom filter's false positive rate P, the number of hash functions K, the filter length M (in bits), and the number of inserted elements N has been worked out:
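How a collision turns into a false positive can be traced by hand on a very small filter. The sketch below is only an illustration: the 8-bit size, the two hash functions, and the names `ToyBloomFilter`, `hash1`, `hash2`, and `mightContain` are hypothetical choices made so the arithmetic is easy to follow, not part of the original text.

```cpp
#include <bitset>
#include <cstddef>

// Toy Bloom filter: 8 bits and two deliberately simple hash functions
// (hypothetical choices, picked only so the collision is easy to trace).
struct ToyBloomFilter {
    std::bitset<8> bits;

    static std::size_t hash1(unsigned x) { return x % 8; }
    static std::size_t hash2(unsigned x) { return (x / 8) % 8; }

    // Insertion sets the bit selected by each hash function.
    void insert(unsigned x) {
        bits.set(hash1(x));
        bits.set(hash2(x));
    }

    // Lookup answers "possibly present" only if every selected bit is set.
    bool mightContain(unsigned x) const {
        return bits.test(hash1(x)) && bits.test(hash2(x));
    }
};
```

Inserting 1 sets bits 1 and 0, and inserting 16 sets bits 0 and 2. A lookup of 9 checks bit 1 twice (9 % 8 = 1 and 9 / 8 = 1), finds it set, and reports 9 as present although it was never inserted. A lookup of 3 checks bit 3, finds it clear, and correctly reports 3 as absent.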

Figure 4. False positive rate of the Bloom filter
As the figure above shows, for a fixed number of inserted elements, the longer the Bloom filter, the lower the false positive rate; using more hash functions also lowers it, up to an optimum number of hash functions.
From the above analysis, because the Bloom filter represents elements entirely with bits and its false positive rate is low, it is well suited to membership-lookup tasks where 100% accuracy is not required.
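The figure itself did not survive in this copy, so the exact expression it showed is unknown. A minimal sketch, assuming it plotted the commonly used approximation P = (1 - e^(-K*N/M))^K, lets the trend be checked numerically. The function names `falsePositiveRate` and `optimalHashCount` are hypothetical.

```cpp
#include <cmath>

// Commonly used approximation of the Bloom filter false positive rate
// (assumed here; the original figure is not available):
//   P = (1 - e^(-K*N/M))^K
// k: number of hash functions, m: filter length in bits, n: inserted elements.
double falsePositiveRate(double k, double m, double n) {
    return std::pow(1.0 - std::exp(-k * n / m), k);
}

// Under the same approximation, P is minimized at K = (M/N) * ln 2.
double optimalHashCount(double m, double n) {
    return (m / n) * std::log(2.0);
}
```

For N = 1000 and M = 10000, this approximation gives roughly 9.5% with one hash function and roughly 0.8% with seven, which is close to the optimum of about 6.9. Pushing K well past that point, for example to 20, raises the rate again because the bit array fills up.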
2. Bitmap
2.1 Concept
A bitmap uses one or a few bits to record the state of an integer (its index). What kind of state? The simplest is whether the integer exists. For example, 0 can indicate that the number is absent and 1 that it is present. This makes it possible to tell from the bitmap whether a given integer has appeared. Like the Bloom filter, the main advantage of the bitmap is its small footprint: a few bits can stand in for an integer that would otherwise occupy 4 bytes or more.
2.2 Insert
In principle, inserting a number into a bitmap is easy: simply set the position at the corresponding index to 1. How exactly this is done depends on the data structure used to build the bitmap:

  • Method 1: a bitmap built directly from a bool array

int n = 32 * 10000;            // number of elements to represent
bool* bitmap = new bool[n]();  // build the bitmap directly; () zero-initializes every entry
bitmap[200] = true;            // insert element 200

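In this case, the state of each element can be read and written directly through its index. Note that a bool typically occupies a whole byte, so this form is simple but not the most compact.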

  • Method 2: a bitmap built from an int array

// Build the bitmap from 4-byte ints: each element of the array can represent 32 integers.
unsigned int* bitmap = new unsigned int[10000]();  // unsigned avoids shifting into the sign bit; () zero-initializes
bitmap[200 / 32] |= 1u << (200 % 32);              // use bit operations to insert 200 into the bitmap

Here bit operations are needed to manipulate each bit within an int. "200/32" locates the int element that 200 belongs to (element 6); "200%32" gives the bit position within that int (bit 8); OR-ing the original int value with 1 << (200%32) completes the insertion. (1 << (200%32) shifts 1 left by 200%32 = 8 positions.)
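The index arithmetic described above can be wrapped in a small class so the division, remainder, and shift are written only once. This is a minimal sketch, not the original author's code: the class name `Bitmap` and its members are hypothetical, `std::uint32_t` is used so that the 32-bit word size is explicit, and the `contains` lookup is added by analogy with insertion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper class (not from the original text): a bitmap over the
// integers 0 .. capacity-1, stored 32 per word.
class Bitmap {
public:
    explicit Bitmap(std::size_t capacity)
        : capacity_(capacity), words_((capacity + 31) / 32, 0u) {}

    // Insert: OR the word with a mask that has only the target bit set.
    void insert(std::size_t x) {
        assert(x < capacity_);
        words_[x / 32] |= (std::uint32_t{1} << (x % 32));
    }

    // Lookup: AND with the same mask and check for a nonzero result.
    bool contains(std::size_t x) const {
        assert(x < capacity_);
        return (words_[x / 32] & (std::uint32_t{1} << (x % 32))) != 0;
    }

    // Number of 32-bit words actually allocated.
    std::size_t wordCount() const { return words_.size(); }

private:
    std::size_t capacity_;
    std::vector<std::uint32_t> words_;
};
```

A `Bitmap bm(32 * 10000);` allocates 10000 words, matching the int array above, and `bm.insert(200)` sets bit 8 of word 6, after which `bm.contains(200)` is true while neighbouring values such as 199 and 201 remain absent.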
