forked from TheWeatherChannel/dClass
dClass - Device Classification Engine
HOWTO
To compile the test client, run make in the src directory.
To build with varnish and nginx, please reference the varnish and nginx
subdirectories.
To integrate with the dClass API:
-include the dclass header file:
#include "dclass_client.h"
-define a dclass_index:
dclass_index dci;
-populate the index using a dtree file or OpenDDR resource file:
dclass_load_file(&dci,"/path/to/file.dtree");
-OR-
openddr_load_resources(&dci,"/path/to/openddr/resources");
-classify a string against the index and get the resulting kv data:
dclass_keyvalue *kv=dclass_classify(&dci,"this is a string");
char *id=kv->id;
char *field_xyz=dclass_get_kvalue(kv,"xyz");
-free the index:
dclass_free(&dci);
AUTHORS
Reza Naghibi ([email protected])
Joe Pearson ([email protected])
Anthony Watson ([email protected])
Eric Honer ([email protected])
Luke Kolin ([email protected])
Ivan Kozhuharov ([email protected])
Chris Hill ([email protected])
Chris McClellen ([email protected])
OpenDDR (http://openddr.org/)
NOTES
The goal of this project is to quickly and accurately classify internet browser
user-agent strings. This project was built from the ground up to do so.
Please reference test.dtree in the dtrees subdirectory for detailed pattern
notes.
Performance depends heavily on the conciseness of the patterns, not on the
number of patterns. On a 2.2 GHz Intel i7, the average time spent in
dclass_classify() using the OpenDDR pattern set and a random set of user agents
is 800 ns. Out of the box, memory usage for the OpenDDR pattern set is about
3 MB. This value can go up or down depending on the key fields loaded from
OpenDDR (OPENDDR_KEYS) and various index settings.
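An average per-call figure like the one above can be measured with a small
harness. The following is a minimal sketch, not the benchmark used for the
numbers in this README. All sketch_* names are hypothetical; the callback is
where a call to dclass_classify() against a loaded index would go. It uses the
standard C clock(), which reports CPU time, so many rounds are needed for a
stable average.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: wraps whatever is being timed, for example a
 * call to dclass_classify() against an already loaded index passed via ctx. */
typedef void (*sketch_fn)(const char *input, void *ctx);

/* Stand-in workload so the sketch compiles on its own: it only counts calls.
 * ctx must point to a size_t counter. */
static void sketch_count_calls(const char *input, void *ctx)
{
    (void)input;
    (*(size_t *)ctx)++;
}

/* Run fn over every input, `rounds` times, and return the average CPU time
 * per call in nanoseconds. Returns 0.0 if there is nothing to run. */
static double sketch_avg_ns(sketch_fn fn, void *ctx,
                            const char *const *inputs, size_t count,
                            size_t rounds)
{
    if (count == 0 || rounds == 0)
        return 0.0;

    clock_t start = clock();
    for (size_t r = 0; r < rounds; r++)
        for (size_t i = 0; i < count; i++)
            fn(inputs[i], ctx);
    clock_t end = clock();

    double total_ns = (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC;
    return total_ns / ((double)count * (double)rounds);
}
```

To time real classification, replace sketch_count_calls with a callback that
casts ctx to the loaded index and calls dclass_classify() on the input. Load
the index before starting the clock so that load time is not included.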
All US-ASCII alphanumeric characters are pattern searchable. Non-alphanumeric
pattern-searchable characters are defined in DTREE_HASH_SCHARS. Add or remove
characters as needed. Indexed US-ASCII printable characters (0x20 through 0x7E)
that aren't pattern searchable are replaced with DTREE_PATTERN_ANY and can match
any character. All pattern matching is US-ASCII case insensitive.
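The character rules above can be illustrated with a small normalizer. This is
a sketch of the described behavior only, not the dClass implementation.
SKETCH_SCHARS and SKETCH_ANY are hypothetical stand-ins for DTREE_HASH_SCHARS
and DTREE_PATTERN_ANY, whose real values are defined in the source and may
differ. Dropping bytes outside 0x20 through 0x7E is also an assumption of this
sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; see DTREE_HASH_SCHARS and DTREE_PATTERN_ANY in the
 * dClass source for the real definitions. */
#define SKETCH_SCHARS "_-/.,;:"
#define SKETCH_ANY    '?'

/* Map one input byte to the character it would be matched as.
 * Returns 0 for bytes outside the US-ASCII printable range (assumption:
 * such bytes are skipped). */
static char sketch_index_char(unsigned char c)
{
    if (c < 0x20 || c > 0x7E)
        return 0;
    if (isalnum(c))
        return (char)tolower(c);      /* case-insensitive matching */
    if (strchr(SKETCH_SCHARS, c) != NULL)
        return (char)c;               /* searchable non-alphanumeric */
    return SKETCH_ANY;                /* printable but not searchable */
}

/* Normalize src into dst (capacity dst_len bytes, including the NUL).
 * Returns the number of characters written, excluding the NUL. */
static size_t sketch_normalize(const char *src, char *dst, size_t dst_len)
{
    size_t n = 0;

    if (dst_len == 0)
        return 0;
    for (; *src != '\0' && n + 1 < dst_len; src++) {
        char c = sketch_index_char((unsigned char)*src);
        if (c != 0)
            dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

With these stand-in values, "Mozilla/5.0 (X11)" normalizes to
"mozilla/5.0??x11?": letters are lowercased, '/' and '.' are kept because they
are in the searchable set, and the space and parentheses become the wildcard.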
Write operations on the index are not thread safe. Read operations are thread
safe (with at most one writer). Read operations have the dclass index parameter
declared 'const'. The index is designed for multiple readers and writers;
however, such a feature is currently outside the foreseeable scope of this
project.
Memory limits are tightly bounded. The default configuration allows for 65k
search nodes and 8 MB of general memory. Adjusting DTREE_DT_PACKED* allows for
more search nodes, and increasing DTREE_M_MAX_SLABS allows for more general-use
memory.
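To show how a slab cap bounds memory, here is a minimal sketch of a
slab-backed pool with a fixed maximum slab count. It is not the dClass
allocator. SKETCH_MAX_SLABS plays the role of DTREE_M_MAX_SLABS, and both it
and SKETCH_SLAB_SIZE are hypothetical values chosen for illustration. Total
memory can never exceed SKETCH_MAX_SLABS * SKETCH_SLAB_SIZE bytes.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_SLABS 4      /* hypothetical cap on the number of slabs */
#define SKETCH_SLAB_SIZE 1024   /* hypothetical slab size in bytes */

typedef struct {
    char  *slabs[SKETCH_MAX_SLABS];
    size_t slab_count;          /* slabs allocated so far */
    size_t slab_used;           /* bytes used in the newest slab */
} sketch_pool;

/* Hand out len bytes from the newest slab, adding a slab when it is full.
 * Returns NULL once the slab cap is reached, so memory stays bounded.
 * No alignment handling; suitable for byte strings only. */
static void *sketch_alloc(sketch_pool *p, size_t len)
{
    if (len == 0 || len > SKETCH_SLAB_SIZE)
        return NULL;

    if (p->slab_count == 0 || p->slab_used + len > SKETCH_SLAB_SIZE) {
        if (p->slab_count >= SKETCH_MAX_SLABS)
            return NULL;        /* hard bound reached */
        char *slab = malloc(SKETCH_SLAB_SIZE);
        if (slab == NULL)
            return NULL;
        p->slabs[p->slab_count++] = slab;
        p->slab_used = 0;
    }

    void *out = p->slabs[p->slab_count - 1] + p->slab_used;
    p->slab_used += len;
    return out;
}

/* Release every slab and reset the pool to empty. */
static void sketch_pool_free(sketch_pool *p)
{
    for (size_t i = 0; i < p->slab_count; i++)
        free(p->slabs[i]);
    p->slab_count = 0;
    p->slab_used = 0;
}
```

In this sketch, raising SKETCH_MAX_SLABS is the only way to raise the ceiling,
which mirrors the role the text above gives to DTREE_M_MAX_SLABS.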