Improve results #3

Open
ctwardy opened this issue Mar 2, 2017 · 2 comments

ctwardy commented Mar 2, 2017

Tested on a mix of regular domains and example MEMEX escort/human trafficking domains.
Accuracy is not great. News and wiki do well, but most blog, classified, and shopping pages come back as undecided.

    Category    URL                                                Test
 0  news        nytimes.com                                        news
 1  news        foxnews.com                                        news
 2  news        cnn.com                                            news
 3  news        news.google.com                                    news
 4  news        npr.org/sections/health-shots/2015/07/16/42326...  news
 5  news        news.google.com/news/story?cf=all&hl=en&ned=us...  news
 6  news        https://www.nytimes.com/2017/01/31/us/politics...  news
 7  forum       ubuntuforums.org                                   forum
 8  forum       arstechnica.com                                    news
 9  forum       reddit.com                                         undecided
10  forum       ubuntuforums.org/showthread.php?t=937963           forum
11  wiki        wiki.ubuntu.com/ForumCouncilAgenda                 wiki
12  wiki        en.wikipedia.org                                   wiki
13  wiki        fr.wikipedia.org                                   wiki
14  wiki        ru.wikipedia.org                                   wiki
15  wiki        ar.wikipedia.org                                   wiki
16  wiki        en.wikipedia.org/wiki/denial-of-service_attack     wiki
17  blog        elegantthemes.com/blog/tips-tricks/how-to-crea...  blog
18  blog        grahamcluley.com                                   news
19  blog        erratasec.com                                      undecided
20  blog        krebsonsecurity.com                                undecided
21  blog        joelonsoftware.com                                 undecided
22  blog        schneier.com                                       undecided
23  blog        troyhunt.com                                       undecided
24  classified  craigslist.com                                     undecided
25  classified  geebo.com                                          undecided
26  classified  backpage.com                                       undecided
27  classified  oodle.com                                          undecided
28  classified  classifiedads.com                                  undecided
29  classified  classifiedsgiant.com                               undecided
30  classified  washingtonpost.com/classifieds                     undecided
31  classified  insidenova.com/classifieds/                        undecided
32  classified  close5.com/l/arlington-virginia                    undecided
33  classified  meetaphrodite.com                                  undecided
34  classified  adultsearch.com/account/signin?origin=%2Fkansa...  undecided
35  classified  adultsearch.com/kansas/                            undecided
36  classified  bronx.backpage.com/                                undecided
37  classified  brooklyn.backpage.com/BodyRubs/sexy-dolls-just...  undecided
38  shopping    etsy.com/treasury/tags/original+gift               undecided
39  shopping    overstock.com                                      undecided
40  shopping    amazon.com                                         undecided
41  shopping    marieclaire.com                                    news
42  shopping    ca.boohoo.com                                      undecided
43  shopping    walmart.com                                        news
44  shopping    grocery.walmart.com                                undecided
45  undecided   ibm.com                                            undecided
46  undecided   apple.com                                          undecided
47  undecided   nasa.gov                                           undecided
48  undecided   jpl.nasa.gov                                       undecided
49  undecided   memex.jpl.nasa.gov                                 undecided
50  undecided   soteradefense.com                                  undecided
51  undecided   google.com                                         undecided
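
A per-category breakdown makes the pattern in the table easier to see. The following is a minimal sketch, not the project's own scoring code. It assumes the Category column is the expected label and the Test column is the classifier's output. The helper names `accuracy_by_category` and `overall_accuracy` are hypothetical, and `SAMPLE` holds only eight rows copied from the table, so its numbers are not the full test-set scores.

```python
from collections import defaultdict

# (expected, predicted) pairs for a few rows of the table above.
# Assumption: "Category" is the expected label, "Test" is the classifier output.
SAMPLE = [
    ("news", "news"),             # nytimes.com
    ("forum", "news"),            # arstechnica.com
    ("forum", "undecided"),       # reddit.com
    ("wiki", "wiki"),             # en.wikipedia.org
    ("blog", "news"),             # grahamcluley.com
    ("classified", "undecided"),  # craigslist.com
    ("shopping", "news"),         # walmart.com
    ("undecided", "undecided"),   # ibm.com
]


def accuracy_by_category(pairs):
    """Return {category: fraction of rows where predicted == expected}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for expected, predicted in pairs:
        totals[expected] += 1
        if expected == predicted:
            hits[expected] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}


def overall_accuracy(pairs):
    """Return the fraction of all rows classified correctly."""
    pairs = list(pairs)
    return sum(expected == predicted for expected, predicted in pairs) / len(pairs)


if __name__ == "__main__":
    for category, score in sorted(accuracy_by_category(SAMPLE).items()):
        print(f"{category:<10}  {score:.2f}")
    print(f"overall     {overall_accuracy(SAMPLE):.2f}")
```

Feeding in every row of the table instead of `SAMPLE` would show which categories pull the overall score down.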

ctwardy commented Mar 3, 2017

Now scoring 0.51.

*ACCURACY*: 29/57 = 0.51

ctwardy commented Mar 3, 2017

So... when to close the ticket?

ctwardy added a commit that referenced this issue Mar 3, 2017
Improved #3 (accuracy), #4 (HTTP error).

Accuracy: 29/57 = 51%