Skip to content

Files

Latest commit

 

History

History
1443 lines (1337 loc) · 85.4 KB

README.md

File metadata and controls

1443 lines (1337 loc) · 85.4 KB

Pagerank Project

Part 2 - Word2Vec

task 1:

At the end of task 1, I saved the code into pagerank.py file and by running

python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='weapons'

Output:

INFO:gensim.models.keyedvectors:precomputing L2-norms of word weight vectors
INFO:root:rank=0 pagerank=4.5715e-03 url=www.lawfareblog.com/why-did-you-wait-moral-emptiness-and-drone-strikes
INFO:root:rank=1 pagerank=3.1107e-03 url=www.lawfareblog.com/dc-district-court-dismisses-journalists-drone-lawsuit
INFO:root:rank=2 pagerank=2.0231e-03 url=www.lawfareblog.com/revived-cia-drone-strike-program-comments-new-policy
INFO:root:rank=3 pagerank=1.9667e-03 url=www.lawfareblog.com/us-court-appeals-dc-circuit-dismisses-suit-over-us-drone-strike
INFO:root:rank=4 pagerank=1.1788e-03 url=www.lawfareblog.com/iran-shoots-down-us-drone-domestic-and-international-legal-implications
INFO:root:rank=5 pagerank=1.1620e-03 url=www.lawfareblog.com/slaughterbots-and-other-anticipated-autonomous-weapons-problems
INFO:root:rank=6 pagerank=1.1276e-03 url=www.lawfareblog.com/german-courts-weigh-legal-responsibility-us-drone-strikes
INFO:root:rank=7 pagerank=8.3738e-04 url=www.lawfareblog.com/shift-jsoc-drone-strikes-does-not-mean-cia-has-been-sidelined
INFO:root:rank=8 pagerank=7.8704e-04 url=www.lawfareblog.com/atomwaffen-division-member-pleads-guilty-firearms-charge
INFO:root:rank=9 pagerank=7.8570e-04 url=www.lawfareblog.com/waiving-imminent-threat-test-cia-drone-strikes-pakistan

task 2

run the command:

python3 pagerank2.py --data=data/lawfareblog.csv.gz --search_query='biden' --s_weight=.01

Output :

INFO:gensim.models.keyedvectors:precomputing L2-norms of word weight vectors
INFO:root:rank=0 pagerank=0.02051282301545143 url=www.lawfareblog.com/new-obama-biden-campaign-bumper-sticker
INFO:root:rank=1 pagerank=0.0109577476978302 url=www.lawfareblog.com/trump-encourages-china-investigate-biden-family-clouding-trade-talks
INFO:root:rank=2 pagerank=0.010501262731850147 url=www.lawfareblog.com/live-brookings-host-vice-president-joe-biden
INFO:root:rank=3 pagerank=0.005239618942141533 url=www.lawfareblog.com/document-trump-revokes-obama-executive-order-counterterrorism-strike-casualty-reporting
INFO:root:rank=4 pagerank=0.00207848334684968 url=www.lawfareblog.com/obamas-term-end-thoughts-targeted-killing
INFO:root:rank=5 pagerank=0.0009015162941068411 url=www.lawfareblog.com/thoughts-about-obama-administrations-counterterrorism-paradigm-light-al-liby-and-ikrima-operations
INFO:root:rank=6 pagerank=0.000671088753733784 url=www.lawfareblog.com/president-obama-secures-final-senate-vote-necessary-defend-iran-deal
INFO:root:rank=7 pagerank=0.0006702161044813693 url=www.lawfareblog.com/obamas-moral-muse
INFO:root:rank=8 pagerank=0.0006524198688566685 url=www.lawfareblog.com/obamas-legacy-law-transparency-and-politics-anguish
INFO:root:rank=9 pagerank=0.0006270432495512068 url=www.lawfareblog.com/speaking-law-obama-administrations-addresses-national-security-law

The weight is automatically set to 0.03 and can be changed to calculate the query score using the -- s_weight = x.x in the python3 run command above

Old Code

In this project, you will create a simple search engine for the website https://www.lawfareblog.com. This website provides legal analysis on US national security issues.

Due date: Sunday, 18 September at midnight

Computation: This project has low computational requirements. You are not required to complete it on the lambda server (although you are welcome to if you'd like).

Background

Data:

The data folder contains two files that store example "web graphs". The file small.csv.gz contains the example graph from the Deeper Inside Pagerank paper. This is a small graph, so we can manually inspect the contents of this file with the following command:

$ zcat data/small.csv.gz
source,target
1,2
1,3
3,1
3,2
3,5
4,5
4,6
5,6
5,4
6,4

Recall: The cat terminal command outputs the contents of a file to stdout, and the zcat command first decompressed a gzipped file and then outputs the decompressed contents.

As you can see, the graph is stored as a CSV file. The first line is a header, and each subsequent line stores a single edge in the graph. The first row contains the source node of the edge and the second row the target node. The file is assumed to be sorted alphabetically.

The second data file lawfareblog.csv.gz contains the link structure for the lawfare blog. Let's take a look at the first 10 of these lines:

$ zcat data/lawfareblog.csv.gz | head
source,target
www.lawfareblog.com/,www.lawfareblog.com/topic/interrogation
www.lawfareblog.com/,www.lawfareblog.com/upcoming-events
www.lawfareblog.com/,www.lawfareblog.com/
www.lawfareblog.com/,www.lawfareblog.com/our-comments-policy
www.lawfareblog.com/,www.lawfareblog.com/litigation-documents-related-appointment-matthew-whitaker-acting-attorney-general
www.lawfareblog.com/,www.lawfareblog.com/topic/lawfare-research-paper-series
www.lawfareblog.com/,www.lawfareblog.com/topic/book-reviews
www.lawfareblog.com/,www.lawfareblog.com/documents-related-mueller-investigation
www.lawfareblog.com/,www.lawfareblog.com/topic/international-law-loac

You can see that in this file, the node names are URLs. Semantically, each line corresponds to an HTML <a> tag that is contained in the source webpage and links to the target webpage.

We can use the following command to count the total number of links in the file:

$ zcat data/lawfareblog.csv.gz | wc -l
1610789

Since every link corresponds to a non-zero entry in the P matrix, this is also the value of nnz(P). (Technically, we should subtract 1 from this value since the wc -l command also counts the header line, not just the data lines.)

To get the dimensions of P, we need to count the total number of nodes in the graph. The following command achieves this by: decompressing the file, extracting the first column, removing all duplicate lines, then counting the results.

$ zcat data/lawfareblog.csv.gz | cut -f1 -d, | uniq | wc -l
25761

This matrix is large enough that computing matrix products for dense matrices takes several minutes on a single CPU. Fortunately, however, the matrix is very sparse. The following python code computes the fraction of entries in the matrix with non-zero values:

>>> 1610788 / (25760**2)
0.0024274297384360172

Thus, by using sparse matrix operations, we will be able to speed up the code significantly.

Code:

The pagerank.py file contains code for loading the graph CSV files and searching through their nodes for key phrases. For example, you can perform a search for all nodes (i.e. urls) that mention the string corona with the following command:

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --search_query=corona

NOTE: It will take about 10 seconds to load and parse the data files. All the other computation happens essentially instantly.

Currently, the pagerank of the nodes is not currently being calculated correctly, and so the webpages are returned in an arbitrary order. Your task in this assignment will be to fix these calculations in order to have the most important results (i.e. highest pagerank results) returned first.

Task 1: the power method

Implement the WebGraph.power_method function in pagerank.py for computing the pagerank vector by fixing the FIXME annotation.

Part 1:

To check that your implementation is working, you should run the program on the data/small.csv.gz graph. For my implementation, I get the following output.

$ python3 pagerank.py --data=data/small.csv.gz --verbose
DEBUG:root:computing indices
DEBUG:root:computing values
DEBUG:root:i=0 residual=2.5629e-01
DEBUG:root:i=1 residual=1.1841e-01
DEBUG:root:i=2 residual=7.0701e-02
DEBUG:root:i=3 residual=3.1815e-02
DEBUG:root:i=4 residual=2.0497e-02
DEBUG:root:i=5 residual=1.0108e-02
DEBUG:root:i=6 residual=6.3716e-03
DEBUG:root:i=7 residual=3.4228e-03
DEBUG:root:i=8 residual=2.0879e-03
DEBUG:root:i=9 residual=1.1750e-03
DEBUG:root:i=10 residual=7.0131e-04
DEBUG:root:i=11 residual=4.0321e-04
DEBUG:root:i=12 residual=2.3800e-04
DEBUG:root:i=13 residual=1.3812e-04
DEBUG:root:i=14 residual=8.1083e-05
DEBUG:root:i=15 residual=4.7251e-05
DEBUG:root:i=16 residual=2.7704e-05
DEBUG:root:i=17 residual=1.6164e-05
DEBUG:root:i=18 residual=9.4778e-06
DEBUG:root:i=19 residual=5.5066e-06
DEBUG:root:i=20 residual=3.2042e-06
DEBUG:root:i=21 residual=1.8612e-06
DEBUG:root:i=22 residual=1.1283e-06
DEBUG:root:i=23 residual=6.1907e-07
INFO:root:rank=0 pagerank=6.6270e-01 url=4
INFO:root:rank=1 pagerank=5.2179e-01 url=6
INFO:root:rank=2 pagerank=4.1434e-01 url=5
INFO:root:rank=3 pagerank=2.3175e-01 url=2
INFO:root:rank=4 pagerank=1.8590e-01 url=3
INFO:root:rank=5 pagerank=1.6917e-01 url=1

Yours likely won't be identical (due to weird floating point issues), but it should be similar. In particular, the ranking of the nodes/urls should be the same order.

NOTE: The --verbose flag causes all of the lines beginning with DEBUG to be printed. By default, only lines beginning with INFO are printed.

Part 2:

The pagerank.py file has an option --search_query, which takes a string as a parameter. If this argument is used, then the program returns all nodes that match the query string sorted according to their pagerank. Essentially, this gives us the most important pages related to our query.

Again, you may not get the exact same results as me, but you should get similar results to the examples I've shown below. Verify that you do in fact get similar results.

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='corona'
INFO:root:rank=0 pagerank=1.0038e-03 url=www.lawfareblog.com/lawfare-podcast-united-nations-and-coronavirus-crisis
INFO:root:rank=1 pagerank=8.9224e-04 url=www.lawfareblog.com/house-oversight-committee-holds-day-two-hearing-government-coronavirus-response
INFO:root:rank=2 pagerank=7.0390e-04 url=www.lawfareblog.com/britains-coronavirus-response
INFO:root:rank=3 pagerank=6.9153e-04 url=www.lawfareblog.com/prosecuting-purposeful-coronavirus-exposure-terrorism
INFO:root:rank=4 pagerank=6.7041e-04 url=www.lawfareblog.com/israeli-emergency-regulations-location-tracking-coronavirus-carriers
INFO:root:rank=5 pagerank=6.6256e-04 url=www.lawfareblog.com/why-congress-conducting-business-usual-face-coronavirus
INFO:root:rank=6 pagerank=6.5046e-04 url=www.lawfareblog.com/congressional-homeland-security-committees-seek-ways-support-state-federal-responses-coronavirus
INFO:root:rank=7 pagerank=6.3620e-04 url=www.lawfareblog.com/paper-hearing-experts-debate-digital-contact-tracing-and-coronavirus-privacy-concerns
INFO:root:rank=8 pagerank=6.1248e-04 url=www.lawfareblog.com/house-subcommittee-voices-concerns-over-us-management-coronavirus
INFO:root:rank=9 pagerank=6.0187e-04 url=www.lawfareblog.com/livestream-house-oversight-committee-holds-hearing-government-coronavirus-response

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='trump'
INFO:root:rank=0 pagerank=5.7826e-03 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
INFO:root:rank=1 pagerank=5.2338e-03 url=www.lawfareblog.com/document-trump-revokes-obama-executive-order-counterterrorism-strike-casualty-reporting
INFO:root:rank=2 pagerank=5.1297e-03 url=www.lawfareblog.com/trump-administrations-worrying-new-policy-israeli-settlements
INFO:root:rank=3 pagerank=4.6599e-03 url=www.lawfareblog.com/dc-circuit-overrules-district-courts-due-process-ruling-qasim-v-trump
INFO:root:rank=4 pagerank=4.5934e-03 url=www.lawfareblog.com/donald-trump-and-politically-weaponized-executive-branch
INFO:root:rank=5 pagerank=4.3071e-03 url=www.lawfareblog.com/how-trumps-approach-middle-east-ignores-past-future-and-human-condition
INFO:root:rank=6 pagerank=4.0935e-03 url=www.lawfareblog.com/why-trump-cant-buy-greenland
INFO:root:rank=7 pagerank=3.7591e-03 url=www.lawfareblog.com/oral-argument-summary-qassim-v-trump
INFO:root:rank=8 pagerank=3.4509e-03 url=www.lawfareblog.com/dc-circuit-court-denies-trump-rehearing-mazars-case
INFO:root:rank=9 pagerank=3.4484e-03 url=www.lawfareblog.com/second-circuit-rules-mazars-must-hand-over-trump-tax-returns-new-york-prosecutors

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='iran'
INFO:root:rank=0 pagerank=4.5746e-03 url=www.lawfareblog.com/praise-presidents-iran-tweets
INFO:root:rank=1 pagerank=4.4174e-03 url=www.lawfareblog.com/how-us-iran-tensions-could-disrupt-iraqs-fragile-peace
INFO:root:rank=2 pagerank=2.6928e-03 url=www.lawfareblog.com/cyber-command-operational-update-clarifying-june-2019-iran-operation
INFO:root:rank=3 pagerank=1.9391e-03 url=www.lawfareblog.com/aborted-iran-strike-fine-line-between-necessity-and-revenge
INFO:root:rank=4 pagerank=1.5452e-03 url=www.lawfareblog.com/parsing-state-departments-letter-use-force-against-iran
INFO:root:rank=5 pagerank=1.5357e-03 url=www.lawfareblog.com/iranian-hostage-crisis-and-its-effect-american-politics
INFO:root:rank=6 pagerank=1.5258e-03 url=www.lawfareblog.com/announcing-united-states-and-use-force-against-iran-new-lawfare-e-book
INFO:root:rank=7 pagerank=1.4221e-03 url=www.lawfareblog.com/us-names-iranian-revolutionary-guard-terrorist-organization-and-sanctions-international-criminal
INFO:root:rank=8 pagerank=1.1788e-03 url=www.lawfareblog.com/iran-shoots-down-us-drone-domestic-and-international-legal-implications
INFO:root:rank=9 pagerank=1.1463e-03 url=www.lawfareblog.com/israel-iran-syria-clash-and-law-use-force

Part 3:

The webgraph of lawfareblog.com (i.e. the P matrix) naturally contains a lot of structure. For example, essentially all pages on the domain have links to the root page https://lawfareblog.com/ and other "non-article" pages like https://www.lawfareblog.com/topics and https://www.lawfareblog.com/subscribe-lawfare. These pages therefore have a large pagerank. We can get a list of the pages with the largest pagerank by running

$ python3 pagerank.py --data=data/lawfareblog.csv.gz
INFO:root:rank=0 pagerank=2.8741e-01 url=www.lawfareblog.com/lawfare-job-board
INFO:root:rank=1 pagerank=2.8741e-01 url=www.lawfareblog.com/masthead
INFO:root:rank=2 pagerank=2.8741e-01 url=www.lawfareblog.com/litigation-documents-related-appointment-matthew-whitaker-acting-attorney-general
INFO:root:rank=3 pagerank=2.8741e-01 url=www.lawfareblog.com/documents-related-mueller-investigation
INFO:root:rank=4 pagerank=2.8741e-01 url=www.lawfareblog.com/topics
INFO:root:rank=5 pagerank=2.8741e-01 url=www.lawfareblog.com/about-lawfare-brief-history-term-and-site
INFO:root:rank=6 pagerank=2.8741e-01 url=www.lawfareblog.com/snowden-revelations
INFO:root:rank=7 pagerank=2.8741e-01 url=www.lawfareblog.com/support-lawfare
INFO:root:rank=8 pagerank=2.8741e-01 url=www.lawfareblog.com/upcoming-events
INFO:root:rank=9 pagerank=2.8741e-01 url=www.lawfareblog.com/our-comments-policy

Most of these pages are not very interesting, however, because they are not articles, and usually when we are performing a web search, we only want articles.

This raises the question: How can we find the most important articles filtering out the non-article pages? The answer is to modify the P matrix by removing all links to non-article pages.

This raises another question: How do we know if a link is a non-article page? Unfortunately, this is a hard question to answer with 100% accuracy, but there are many methods that get us most of the way there. One easy to implement method is to compute what's called the "in-link ratio" of each node (i.e. the total number of edges with the node as a target divided by the total number of nodes), and then remove nodes from the search results with too-high of a ratio. The intuition is that non-article pages often appear in the menu of a webpage, and so have links from almost all of the other webpages; but article-webpages are unlikely to appear on a menu and so will only have a small number of links from other webpages. The --filter_ratio parameter causes the code to remove all pages that have an in-link ratio larger than the provided value.

Using this option, we can estimate the most important articles on the domain with the following command:

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2
INFO:root:rank=0 pagerank=3.4696e-01 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
INFO:root:rank=1 pagerank=2.9521e-01 url=www.lawfareblog.com/livestream-nov-21-impeachment-hearings-0
INFO:root:rank=2 pagerank=2.9040e-01 url=www.lawfareblog.com/opening-statement-david-holmes
INFO:root:rank=3 pagerank=1.5179e-01 url=www.lawfareblog.com/lawfare-podcast-ben-nimmo-whack-mole-game-disinformation
INFO:root:rank=4 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1963
INFO:root:rank=5 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1964
INFO:root:rank=6 pagerank=1.5071e-01 url=www.lawfareblog.com/lawfare-podcast-week-was-impeachment
INFO:root:rank=7 pagerank=1.4957e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1962
INFO:root:rank=8 pagerank=1.4367e-01 url=www.lawfareblog.com/cyberlaw-podcast-mistrusting-google
INFO:root:rank=9 pagerank=1.4240e-01 url=www.lawfareblog.com/lawfare-podcast-bonus-edition-gordon-sondland-vs-committee-no-bull

Notice that the urls in this list look much more like articles than the urls in the previous list.

When Google calculates their P matrix for the web, they use a similar (but much more complicated) process to modify the P matrix in order to reduce spam results. The exact formula they use is a jealously guarded secret that they update continuously.

In the case above, notice that we have accidentally removed the blog's most popular article (<www.lawfareblog.com/snowden-revelations>). The blog editors believed that Snowden's revelations about NSA spying are so important that they directly put a link to the article on the menu. So every single webpage in the domain links to the Snowden article, and our "anti-spam" --filter-ratio argument removed this article from the list. In general, it is a challenging open problem to remove spam from pagerank results, and all current solutions rely on careful human tuning and still have lots of false positives and false negatives.

Part 4:

Recall from the reading that the runtime of pagerank depends heavily on the eigengap of the \bar\bar P matrix, and that this eigengap is bounded by the alpha parameter.

Run the following four commands:

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose 
$ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --alpha=0.99999
$ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2
$ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2 --alpha=0.99999

You should notice that the last command takes considerably more iterations to compute the pagerank vector. (My code takes 685 iterations for this call, and about 10 iterations for all the others.)

This raises the question: Why does the second command (with the --alpha option but without the --filter_ratio) option not take a long time to run? The answer is that the P graph for https://www.lawfareblog.com naturally has a large eigengap and so is fast to compute for all alpha values, but the modified graph does not have a large eigengap and so requires a small alpha for fast convergence.

Changing the value of alpha also gives us very different pagerank rankings. For example,

$ python3 pagerank_solution.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2
INFO:root:rank=0 pagerank=3.4696e-01 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
INFO:root:rank=1 pagerank=2.9521e-01 url=www.lawfareblog.com/livestream-nov-21-impeachment-hearings-0
INFO:root:rank=2 pagerank=2.9040e-01 url=www.lawfareblog.com/opening-statement-david-holmes
INFO:root:rank=3 pagerank=1.5179e-01 url=www.lawfareblog.com/lawfare-podcast-ben-nimmo-whack-mole-game-disinformation
INFO:root:rank=4 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1963
INFO:root:rank=5 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1964
INFO:root:rank=6 pagerank=1.5071e-01 url=www.lawfareblog.com/lawfare-podcast-week-was-impeachment
INFO:root:rank=7 pagerank=1.4957e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1962
INFO:root:rank=8 pagerank=1.4367e-01 url=www.lawfareblog.com/cyberlaw-podcast-mistrusting-google
INFO:root:rank=9 pagerank=1.4240e-01 url=www.lawfareblog.com/lawfare-podcast-bonus-edition-gordon-sondland-vs-committee-no-bull

$ python3 pagerank_solution.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2 --alpha=0.99999
INFO:root:rank=0 pagerank=7.0149e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
INFO:root:rank=1 pagerank=7.0149e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
INFO:root:rank=2 pagerank=1.0552e-01 url=www.lawfareblog.com/cost-using-zero-days
INFO:root:rank=3 pagerank=3.1755e-02 url=www.lawfareblog.com/lawfare-podcast-former-congressman-brian-baird-and-daniel-schuman-how-congress-can-continue-function
INFO:root:rank=4 pagerank=2.2040e-02 url=www.lawfareblog.com/events
INFO:root:rank=5 pagerank=1.6027e-02 url=www.lawfareblog.com/water-wars-increased-us-focus-indo-pacific
INFO:root:rank=6 pagerank=1.6026e-02 url=www.lawfareblog.com/water-wars-drill-maybe-drill
INFO:root:rank=7 pagerank=1.6023e-02 url=www.lawfareblog.com/water-wars-disjointed-operations-south-china-sea
INFO:root:rank=8 pagerank=1.6020e-02 url=www.lawfareblog.com/water-wars-song-oil-and-fire
INFO:root:rank=9 pagerank=1.6020e-02 url=www.lawfareblog.com/water-wars-sinking-feeling-philippine-china-relations

Which of these rankings is better is entirely subjective, and the only way to know if you have the "best" alpha for your application is to try several variations and see what is best. If large alphas are good for your application, you can see that there is a trade-off between quality answers and algorithmic runtime. We'll be exploring this trade-off more formally in class over the rest of the semester.

Task 2: the personalization vector

The most interesting applications of pagerank involve the personalization vector. Implement the WebGraph.make_personalization_vector function so that it outputs a personalization vector tuned for the input query. The pseudocode for the function is:

for each index in the personalization vector:
    get the url for the index (see the _url_to_index function)
    check if the url satisfies the input query (see the url_satisfies_query function)
    if so, set the corresponding index to one
normalize the vector

Part 1:

The command line argument --personalization_vector_query will use the function you created above to augment your search with a custom personalization vector. If you've implemented the function correctly, you should get results similar to:

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2 --personalization_vector_query='corona'
INFO:root:rank=0 pagerank=6.3127e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
INFO:root:rank=1 pagerank=6.3124e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
INFO:root:rank=2 pagerank=1.5947e-01 url=www.lawfareblog.com/chinatalk-how-party-takes-its-propaganda-global
INFO:root:rank=3 pagerank=1.2209e-01 url=www.lawfareblog.com/brexit-not-immune-coronavirus
INFO:root:rank=4 pagerank=1.2209e-01 url=www.lawfareblog.com/rational-security-my-corona-edition
INFO:root:rank=5 pagerank=9.3360e-02 url=www.lawfareblog.com/trump-cant-reopen-country-over-state-objections
INFO:root:rank=6 pagerank=9.1920e-02 url=www.lawfareblog.com/prosecuting-purposeful-coronavirus-exposure-terrorism
INFO:root:rank=7 pagerank=9.1920e-02 url=www.lawfareblog.com/britains-coronavirus-response
INFO:root:rank=8 pagerank=7.7770e-02 url=www.lawfareblog.com/lawfare-podcast-united-nations-and-coronavirus-crisis
INFO:root:rank=9 pagerank=7.2888e-02 url=www.lawfareblog.com/house-oversight-committee-holds-day-two-hearing-government-coronavirus-response

Notice that these results are significantly different than when using the --search_query option:

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2 --search_query='corona'
INFO:root:rank=0 pagerank=8.1320e-03 url=www.lawfareblog.com/house-oversight-committee-holds-day-two-hearing-government-coronavirus-response
INFO:root:rank=1 pagerank=7.7908e-03 url=www.lawfareblog.com/lawfare-podcast-united-nations-and-coronavirus-crisis
INFO:root:rank=2 pagerank=5.2262e-03 url=www.lawfareblog.com/livestream-house-oversight-committee-holds-hearing-government-coronavirus-response
INFO:root:rank=3 pagerank=3.9584e-03 url=www.lawfareblog.com/britains-coronavirus-response
INFO:root:rank=4 pagerank=3.8114e-03 url=www.lawfareblog.com/prosecuting-purposeful-coronavirus-exposure-terrorism
INFO:root:rank=5 pagerank=3.3973e-03 url=www.lawfareblog.com/paper-hearing-experts-debate-digital-contact-tracing-and-coronavirus-privacy-concerns
INFO:root:rank=6 pagerank=3.3633e-03 url=www.lawfareblog.com/cyberlaw-podcast-how-israel-fighting-coronavirus
INFO:root:rank=7 pagerank=3.3557e-03 url=www.lawfareblog.com/israeli-emergency-regulations-location-tracking-coronavirus-carriers
INFO:root:rank=8 pagerank=3.2160e-03 url=www.lawfareblog.com/congress-needs-coronavirus-failsafe-its-too-late
INFO:root:rank=9 pagerank=3.1036e-03 url=www.lawfareblog.com/why-congress-conducting-business-usual-face-coronavirus

Which results are better? Again, that depends on what you mean by "better." With the --personalization_vector_query option, a webpage is important only if other coronavirus webpages also think it's important; with the --search_query option, a webpage is important if any other webpage thinks it's important. You'll notice that in the later example, many of the webpages are about Congressional proceedings related to the coronavirus. From a strictly coronavirus perspective, these are not very important webpages. But in the broader context of national security, these are very important webpages.

Google engineers spend TONs of time fine-tuning their pagerank personalization vectors to remove spam webpages. Exactly how they do this is another one of their secrets that they don't publicly talk about.

Part 2:

Another use of the --personalization_vector_query option is that we can find out what webpages are related to the coronavirus but don't directly mention the coronavirus. This can be used to map out what types of topics are similar to the coronavirus.

For example, the following query ranks all webpages by their corona importance, but removes webpages mentioning corona from the results.

$ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2 --personalization_vector_query='corona' --search_query='-corona'
INFO:root:rank=0 pagerank=6.3127e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
INFO:root:rank=1 pagerank=6.3124e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
INFO:root:rank=2 pagerank=1.5947e-01 url=www.lawfareblog.com/chinatalk-how-party-takes-its-propaganda-global
INFO:root:rank=3 pagerank=9.3360e-02 url=www.lawfareblog.com/trump-cant-reopen-country-over-state-objections
INFO:root:rank=4 pagerank=7.0277e-02 url=www.lawfareblog.com/fault-lines-foreign-policy-quarantined
INFO:root:rank=5 pagerank=6.9713e-02 url=www.lawfareblog.com/lawfare-podcast-mom-and-dad-talk-clinical-trials-pandemic
INFO:root:rank=6 pagerank=6.4944e-02 url=www.lawfareblog.com/limits-world-health-organization
INFO:root:rank=7 pagerank=5.9492e-02 url=www.lawfareblog.com/chinatalk-dispatches-shanghai-beijing-and-hong-kong
INFO:root:rank=8 pagerank=5.1245e-02 url=www.lawfareblog.com/us-moves-dismiss-case-against-company-linked-ira-troll-farm
INFO:root:rank=9 pagerank=5.1245e-02 url=www.lawfareblog.com/livestream-house-armed-services-holds-hearing-national-security-challenges-north-and-south-america

You can see that there are many urls about concepts that are obviously related like "covid", "clinical trials", and "quarantine", but this algorithm also finds articles about Chinese propaganda and Trump's policy decisions. Both of these articles are highly relevant to coronavirus discussions, but a simple keyword search for corona or related terms would not find these articles.

Submission

  1. Create a new repo on github (not a fork of this repo).

  2. Run the following commands, and paste their output into the code blocks below.

    Task 1, part 1:

    $ python3 pagerank.py --data=data/small.csv.gz --verbose
    
     DEBUG:root:computing indices
     DEBUG:root:computing values
     DEBUG:root:i=0 residual=0.2562914192676544
     DEBUG:root:i=1 residual=0.11841227114200592
     DEBUG:root:i=2 residual=0.07070134580135345
     DEBUG:root:i=3 residual=0.03181542828679085
     DEBUG:root:i=4 residual=0.020496590062975883
     DEBUG:root:i=5 residual=0.01010835450142622
     DEBUG:root:i=6 residual=0.006371544674038887
     DEBUG:root:i=7 residual=0.0034227892756462097
     DEBUG:root:i=8 residual=0.002087961183860898
     DEBUG:root:i=9 residual=0.0011749734403565526
     DEBUG:root:i=10 residual=0.0007012754795141518
     DEBUG:root:i=11 residual=0.00040320929838344455
     DEBUG:root:i=12 residual=0.00023798426263965666
     DEBUG:root:i=13 residual=0.00013812065299134701
     DEBUG:root:i=14 residual=8.108324982458726e-05
     DEBUG:root:i=15 residual=4.7269360948121175e-05
     DEBUG:root:i=16 residual=2.7704918466042727e-05
     DEBUG:root:i=17 residual=1.6170568414963782e-05
     DEBUG:root:i=18 residual=9.479118489252869e-06
     DEBUG:root:i=19 residual=5.4782999541203026e-06
     DEBUG:root:i=20 residual=3.2123323308042018e-06
     DEBUG:root:i=21 residual=1.8802053318722756e-06
     DEBUG:root:i=22 residual=1.1228398761886638e-06
     DEBUG:root:i=23 residual=6.322027275018627e-07
     INFO:root:rank=0 pagerank=6.6270e-01 url=4
     INFO:root:rank=1 pagerank=5.2179e-01 url=6
     INFO:root:rank=2 pagerank=4.1434e-01 url=5
     INFO:root:rank=3 pagerank=2.3175e-01 url=2
     INFO:root:rank=4 pagerank=1.8590e-01 url=3
     INFO:root:rank=5 pagerank=1.6917e-01 url=1
    

    Task 1, part 2:

    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='corona'
     INFO:root:rank=0 pagerank=1.0038e-03 url=www.lawfareblog.com/lawfare-podcast-united-nations-and-coronavirus-crisis
     INFO:root:rank=1 pagerank=8.9224e-04 url=www.lawfareblog.com/house-oversight-committee-holds-day-two-hearing-government-coronavirus-response
     INFO:root:rank=2 pagerank=7.0390e-04 url=www.lawfareblog.com/britains-coronavirus-response
     INFO:root:rank=3 pagerank=6.9153e-04 url=www.lawfareblog.com/prosecuting-purposeful-coronavirus-exposure-terrorism
     INFO:root:rank=4 pagerank=6.7041e-04 url=www.lawfareblog.com/israeli-emergency-regulations-location-tracking-coronavirus-carriers
     INFO:root:rank=5 pagerank=6.6256e-04 url=www.lawfareblog.com/why-congress-conducting-business-usual-face-coronavirus
     INFO:root:rank=6 pagerank=6.5046e-04 url=www.lawfareblog.com/congressional-homeland-security-committees-seek-ways-support-state-federal-responses-coronavirus
     INFO:root:rank=7 pagerank=6.3620e-04 url=www.lawfareblog.com/paper-hearing-experts-debate-digital-contact-tracing-and-coronavirus-privacy-concerns
     INFO:root:rank=8 pagerank=6.1248e-04 url=www.lawfareblog.com/house-subcommittee-voices-concerns-over-us-management-coronavirus
     INFO:root:rank=9 pagerank=6.0187e-04 url=www.lawfareblog.com/livestream-house-oversight-committee-holds-hearing-government-coronavirus-response
    
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='trump'
     INFO:root:rank=0 pagerank=5.7826e-03 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
     INFO:root:rank=1 pagerank=5.2338e-03 url=www.lawfareblog.com/document-trump-revokes-obama-executive-order-counterterrorism-strike-casualty-reporting
     INFO:root:rank=2 pagerank=5.1297e-03 url=www.lawfareblog.com/trump-administrations-worrying-new-policy-israeli-settlements
     INFO:root:rank=3 pagerank=4.6599e-03 url=www.lawfareblog.com/dc-circuit-overrules-district-courts-due-process-ruling-qasim-v-trump
     INFO:root:rank=4 pagerank=4.5934e-03 url=www.lawfareblog.com/donald-trump-and-politically-weaponized-executive-branch
     INFO:root:rank=5 pagerank=4.3071e-03 url=www.lawfareblog.com/how-trumps-approach-middle-east-ignores-past-future-and-human-condition
     INFO:root:rank=6 pagerank=4.0935e-03 url=www.lawfareblog.com/why-trump-cant-buy-greenland
     INFO:root:rank=7 pagerank=3.7591e-03 url=www.lawfareblog.com/oral-argument-summary-qassim-v-trump
     INFO:root:rank=8 pagerank=3.4509e-03 url=www.lawfareblog.com/dc-circuit-court-denies-trump-rehearing-mazars-case
     INFO:root:rank=9 pagerank=3.4484e-03 url=www.lawfareblog.com/second-circuit-rules-mazars-must-hand-over-trump-tax-returns-new-york-prosecutors
    
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --search_query='Iran'
    
     INFO:root:rank=0 pagerank=4.5746e-03 url=www.lawfareblog.com/praise-presidents-iran-tweets
     INFO:root:rank=1 pagerank=4.4174e-03 url=www.lawfareblog.com/how-us-iran-tensions-could-disrupt-iraqs-fragile-peace
     INFO:root:rank=2 pagerank=2.6928e-03 url=www.lawfareblog.com/cyber-command-operational-update-clarifying-june-2019-iran-operation
     INFO:root:rank=3 pagerank=1.9391e-03 url=www.lawfareblog.com/aborted-iran-strike-fine-line-between-necessity-and-revenge
     INFO:root:rank=4 pagerank=1.5452e-03 url=www.lawfareblog.com/parsing-state-departments-letter-use-force-against-iran
     INFO:root:rank=5 pagerank=1.5357e-03 url=www.lawfareblog.com/iranian-hostage-crisis-and-its-effect-american-politics
     INFO:root:rank=6 pagerank=1.5258e-03 url=www.lawfareblog.com/announcing-united-states-and-use-force-against-iran-new-lawfare-e-book
     INFO:root:rank=7 pagerank=1.4221e-03 url=www.lawfareblog.com/us-names-iranian-revolutionary-guard-terrorist-organization-and-sanctions-international-criminal
     INFO:root:rank=8 pagerank=1.1788e-03 url=www.lawfareblog.com/iran-shoots-down-us-drone-domestic-and-international-legal-implications
     INFO:root:rank=9 pagerank=1.1463e-03 url=www.lawfareblog.com/israel-iran-syria-clash-and-law-use-force
    
    

    Task 1, part 3:

    $ python3 pagerank.py --data=data/lawfareblog.csv.gz
     INFO:root:rank=0 pagerank=2.8741e-01 url=www.lawfareblog.com/lawfare-job-board
     INFO:root:rank=1 pagerank=2.8741e-01 url=www.lawfareblog.com/masthead
     INFO:root:rank=2 pagerank=2.8741e-01 url=www.lawfareblog.com/litigation-documents-related-appointment-matthew-whitaker-acting-attorney-general
     INFO:root:rank=3 pagerank=2.8741e-01 url=www.lawfareblog.com/documents-related-mueller-investigation
     INFO:root:rank=4 pagerank=2.8741e-01 url=www.lawfareblog.com/topics
     INFO:root:rank=5 pagerank=2.8741e-01 url=www.lawfareblog.com/about-lawfare-brief-history-term-and-site
     INFO:root:rank=6 pagerank=2.8741e-01 url=www.lawfareblog.com/snowden-revelations
     INFO:root:rank=7 pagerank=2.8741e-01 url=www.lawfareblog.com/support-lawfare
     INFO:root:rank=8 pagerank=2.8741e-01 url=www.lawfareblog.com/upcoming-events
     INFO:root:rank=9 pagerank=2.8741e-01 url=www.lawfareblog.com/our-comments-policy
    
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2
    
     INFO:root:rank=0 pagerank=3.4696e-01 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
     INFO:root:rank=1 pagerank=2.9521e-01 url=www.lawfareblog.com/livestream-nov-21-impeachment-hearings-0
     INFO:root:rank=2 pagerank=2.9040e-01 url=www.lawfareblog.com/opening-statement-david-holmes
     INFO:root:rank=3 pagerank=1.5179e-01 url=www.lawfareblog.com/lawfare-podcast-ben-nimmo-whack-mole-game-disinformation
     INFO:root:rank=4 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1963
     INFO:root:rank=5 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1964
     INFO:root:rank=6 pagerank=1.5071e-01 url=www.lawfareblog.com/lawfare-podcast-week-was-impeachment
     INFO:root:rank=7 pagerank=1.4957e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1962
     INFO:root:rank=8 pagerank=1.4367e-01 url=www.lawfareblog.com/cyberlaw-podcast-mistrusting-google
     INFO:root:rank=9 pagerank=1.4240e-01 url=www.lawfareblog.com/lawfare-podcast-bonus-edition-gordon-sondland-vs-committee-no-bull
    

    Task 1, part 4:

    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose 
     DEBUG:root:computing indices
     DEBUG:root:computing values
     DEBUG:root:i=0 residual=1.3793821334838867
     DEBUG:root:i=1 residual=0.11642514914274216
     DEBUG:root:i=2 residual=0.07495073974132538
     DEBUG:root:i=3 residual=0.031712669879198074
     DEBUG:root:i=4 residual=0.01746140420436859
     DEBUG:root:i=5 residual=0.008529536426067352
     DEBUG:root:i=6 residual=0.0044392067939043045
     DEBUG:root:i=7 residual=0.002238823566585779
     DEBUG:root:i=8 residual=0.0011464565759524703
     DEBUG:root:i=9 residual=0.0005798051133751869
     DEBUG:root:i=10 residual=0.00029213540256023407
     DEBUG:root:i=11 residual=0.00014553092478308827
     DEBUG:root:i=12 residual=7.149828888941556e-05
     DEBUG:root:i=13 residual=3.433692472754046e-05
     DEBUG:root:i=14 residual=1.5638515833416022e-05
     DEBUG:root:i=15 residual=6.266389391385019e-06
     DEBUG:root:i=16 residual=2.8251236017240444e-06
     DEBUG:root:i=17 residual=1.3609912912215805e-06
     DEBUG:root:i=18 residual=4.311152963509812e-07
     INFO:root:rank=0 pagerank=2.8741e-01 url=www.lawfareblog.com/lawfare-job-board
     INFO:root:rank=1 pagerank=2.8741e-01 url=www.lawfareblog.com/masthead
     INFO:root:rank=2 pagerank=2.8741e-01 url=www.lawfareblog.com/litigation-documents-related-appointment-matthew-whitaker-acting-attorney-general
     INFO:root:rank=3 pagerank=2.8741e-01 url=www.lawfareblog.com/documents-related-mueller-investigation
     INFO:root:rank=4 pagerank=2.8741e-01 url=www.lawfareblog.com/topics
     INFO:root:rank=5 pagerank=2.8741e-01 url=www.lawfareblog.com/about-lawfare-brief-history-term-and-site
     INFO:root:rank=6 pagerank=2.8741e-01 url=www.lawfareblog.com/snowden-revelations
     INFO:root:rank=7 pagerank=2.8741e-01 url=www.lawfareblog.com/support-lawfare
     INFO:root:rank=8 pagerank=2.8741e-01 url=www.lawfareblog.com/upcoming-events
     INFO:root:rank=9 pagerank=2.8741e-01 url=www.lawfareblog.com/our-comments-policy
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --alpha=0.99999
    
     DEBUG:root:computing indices
     DEBUG:root:computing values
     DEBUG:root:i=0 residual=1.3845856189727783
     DEBUG:root:i=1 residual=0.07088156789541245
     DEBUG:root:i=2 residual=0.01882227510213852
     DEBUG:root:i=3 residual=0.006958262529224157
     DEBUG:root:i=4 residual=0.0027358194347471
     DEBUG:root:i=5 residual=0.0010345563059672713
     DEBUG:root:i=6 residual=0.0003774634387809783
     DEBUG:root:i=7 residual=0.00013533401943277568
     DEBUG:root:i=8 residual=4.8224112106254324e-05
     DEBUG:root:i=9 residual=1.7172435036627576e-05
     DEBUG:root:i=10 residual=6.118058081483468e-06
     DEBUG:root:i=11 residual=2.173422217310872e-06
     DEBUG:root:i=12 residual=7.82504116614291e-07
     INFO:root:rank=0 pagerank=2.8859e-01 url=www.lawfareblog.com/lawfare-job-board
     INFO:root:rank=1 pagerank=2.8859e-01 url=www.lawfareblog.com/masthead
     INFO:root:rank=2 pagerank=2.8859e-01 url=www.lawfareblog.com/litigation-documents-related-appointment-matthew-whitaker-acting-attorney-general
     INFO:root:rank=3 pagerank=2.8859e-01 url=www.lawfareblog.com/documents-related-mueller-investigation
     INFO:root:rank=4 pagerank=2.8859e-01 url=www.lawfareblog.com/topics
     INFO:root:rank=5 pagerank=2.8859e-01 url=www.lawfareblog.com/about-lawfare-brief-history-term-and-site
     INFO:root:rank=6 pagerank=2.8859e-01 url=www.lawfareblog.com/snowden-revelations
     INFO:root:rank=7 pagerank=2.8859e-01 url=www.lawfareblog.com/support-lawfare
     INFO:root:rank=8 pagerank=2.8859e-01 url=www.lawfareblog.com/upcoming-events
     INFO:root:rank=9 pagerank=2.8859e-01 url=www.lawfareblog.com/our-comments-policy
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2
    
     DEBUG:root:computing indices
     DEBUG:root:computing values
     DEBUG:root:i=0 residual=1.2609827518463135
     DEBUG:root:i=1 residual=0.49858558177948
     DEBUG:root:i=2 residual=0.13420072197914124
     DEBUG:root:i=3 residual=0.0692318007349968
     DEBUG:root:i=4 residual=0.023411516100168228
     DEBUG:root:i=5 residual=0.010188630782067776
     DEBUG:root:i=6 residual=0.004910613875836134
     DEBUG:root:i=7 residual=0.002279882086440921
     DEBUG:root:i=8 residual=0.0010739191202446818
     DEBUG:root:i=9 residual=0.0005249588284641504
     DEBUG:root:i=10 residual=0.00026969940518029034
     DEBUG:root:i=11 residual=0.00014575093518942595
     DEBUG:root:i=12 residual=8.241771865868941e-05
     DEBUG:root:i=13 residual=4.819605601369403e-05
     DEBUG:root:i=14 residual=2.881278851418756e-05
     DEBUG:root:i=15 residual=1.7407368432031944e-05
     DEBUG:root:i=16 residual=1.0556699635344557e-05
     DEBUG:root:i=17 residual=6.386945642589126e-06
     DEBUG:root:i=18 residual=3.835165898635751e-06
     DEBUG:root:i=19 residual=2.295828835485736e-06
     DEBUG:root:i=20 residual=1.3655146631208481e-06
     DEBUG:root:i=21 residual=8.102273341137334e-07
     INFO:root:rank=0 pagerank=3.4696e-01 url=www.lawfareblog.com/trump-asks-supreme-court-stay-congressional-subpeona-tax-returns
     INFO:root:rank=1 pagerank=2.9521e-01 url=www.lawfareblog.com/livestream-nov-21-impeachment-hearings-0
     INFO:root:rank=2 pagerank=2.9040e-01 url=www.lawfareblog.com/opening-statement-david-holmes
     INFO:root:rank=3 pagerank=1.5179e-01 url=www.lawfareblog.com/lawfare-podcast-ben-nimmo-whack-mole-game-disinformation
     INFO:root:rank=4 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1963
     INFO:root:rank=5 pagerank=1.5099e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1964
     INFO:root:rank=6 pagerank=1.5071e-01 url=www.lawfareblog.com/lawfare-podcast-week-was-impeachment
     INFO:root:rank=7 pagerank=1.4957e-01 url=www.lawfareblog.com/todays-headlines-and-commentary-1962
     INFO:root:rank=8 pagerank=1.4367e-01 url=www.lawfareblog.com/cyberlaw-podcast-mistrusting-google
     INFO:root:rank=9 pagerank=1.4240e-01 url=www.lawfareblog.com/lawfare-podcast-bonus-edition-gordon-sondland-vs-committee-no-bull
    
    
    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --verbose --filter_ratio=0.2 --alpha=0.99999
    
         DEBUG:root:computing indices
         DEBUG:root:computing values
         DEBUG:root:i=0 residual=1.2827345132827759
         DEBUG:root:i=1 residual=0.5695679783821106
         DEBUG:root:i=2 residual=0.38298743963241577
         DEBUG:root:i=3 residual=0.2173907607793808
         DEBUG:root:i=4 residual=0.14045004546642303
         DEBUG:root:i=5 residual=0.10851320624351501
         DEBUG:root:i=6 residual=0.09284130483865738
         DEBUG:root:i=7 residual=0.08225572854280472
         DEBUG:root:i=8 residual=0.07338864356279373
         DEBUG:root:i=9 residual=0.06561222672462463
         DEBUG:root:i=10 residual=0.059096433222293854
         DEBUG:root:i=11 residual=0.0541754886507988
         DEBUG:root:i=12 residual=0.05111689865589142
         DEBUG:root:i=13 residual=0.04999383166432381
         DEBUG:root:i=14 residual=0.05060902237892151
         DEBUG:root:i=15 residual=0.0525263175368309
         DEBUG:root:i=16 residual=0.0551888681948185
         DEBUG:root:i=17 residual=0.05803852900862694
         DEBUG:root:i=18 residual=0.06059221178293228
         DEBUG:root:i=19 residual=0.0624784380197525
         DEBUG:root:i=20 residual=0.06345319747924805
         DEBUG:root:i=21 residual=0.06340522319078445
         DEBUG:root:i=22 residual=0.06234566867351532
         DEBUG:root:i=23 residual=0.06038395315408707
         DEBUG:root:i=24 residual=0.05769402161240578
         DEBUG:root:i=25 residual=0.05447976291179657
         DEBUG:root:i=26 residual=0.05094277113676071
         DEBUG:root:i=27 residual=0.0472610704600811
         DEBUG:root:i=28 residual=0.043578535318374634
         DEBUG:root:i=29 residual=0.04000161215662956
         DEBUG:root:i=30 residual=0.03660248592495918
         DEBUG:root:i=31 residual=0.033424295485019684
         DEBUG:root:i=32 residual=0.030489472672343254
         DEBUG:root:i=33 residual=0.02780323475599289
         DEBUG:root:i=34 residual=0.025360697880387306
         DEBUG:root:i=35 residual=0.023150015622377396
         DEBUG:root:i=36 residual=0.02115531638264656
         DEBUG:root:i=37 residual=0.019359197467565536
         DEBUG:root:i=38 residual=0.01774349994957447
         DEBUG:root:i=39 residual=0.016290761530399323
         DEBUG:root:i=40 residual=0.0149843106046319
         DEBUG:root:i=41 residual=0.01380856242030859
         DEBUG:root:i=42 residual=0.012749706394970417
         DEBUG:root:i=43 residual=0.011794854886829853
         DEBUG:root:i=44 residual=0.010932852514088154
         DEBUG:root:i=45 residual=0.010153341107070446
         DEBUG:root:i=46 residual=0.009447379969060421
         DEBUG:root:i=47 residual=0.008807037957012653
         DEBUG:root:i=48 residual=0.00822509080171585
         DEBUG:root:i=49 residual=0.007695489563047886
         DEBUG:root:i=50 residual=0.007212585769593716
         DEBUG:root:i=51 residual=0.00677154865115881
         DEBUG:root:i=52 residual=0.006367980502545834
         DEBUG:root:i=53 residual=0.005998112726956606
         DEBUG:root:i=54 residual=0.005658607929944992
         DEBUG:root:i=55 residual=0.005346357356756926
         DEBUG:root:i=56 residual=0.005058770999312401
         DEBUG:root:i=57 residual=0.004793448373675346
         DEBUG:root:i=58 residual=0.004548261873424053
         DEBUG:root:i=59 residual=0.004321318585425615
         DEBUG:root:i=60 residual=0.004110995680093765
         DEBUG:root:i=61 residual=0.003915745764970779
         DEBUG:root:i=62 residual=0.0037342202849686146
         DEBUG:root:i=63 residual=0.0035651903599500656
         DEBUG:root:i=64 residual=0.0034076401498168707
         DEBUG:root:i=65 residual=0.0032605668529868126
         DEBUG:root:i=66 residual=0.0031230042222887278
         DEBUG:root:i=67 residual=0.0029942821711301804
         DEBUG:root:i=68 residual=0.0028736344538629055
         DEBUG:root:i=69 residual=0.002760394709184766
         DEBUG:root:i=70 residual=0.0026539983227849007
         DEBUG:root:i=71 residual=0.002553925383836031
         DEBUG:root:i=72 residual=0.0024596548173576593
         DEBUG:root:i=73 residual=0.0023707826621830463
         DEBUG:root:i=74 residual=0.0022869009990245104
         DEBUG:root:i=75 residual=0.0022076410241425037
         DEBUG:root:i=76 residual=0.0021326374262571335
         DEBUG:root:i=77 residual=0.0020616468973457813
         DEBUG:root:i=78 residual=0.0019943546503782272
         DEBUG:root:i=79 residual=0.001930529368110001
         DEBUG:root:i=80 residual=0.0018698793137446046
         DEBUG:root:i=81 residual=0.0018122568726539612
         DEBUG:root:i=82 residual=0.0017574297962710261
         DEBUG:root:i=83 residual=0.001705214031971991
         DEBUG:root:i=84 residual=0.0016554755857214332
         DEBUG:root:i=85 residual=0.0016080320347100496
         DEBUG:root:i=86 residual=0.0015627194661647081
         DEBUG:root:i=87 residual=0.001519430079497397
         DEBUG:root:i=88 residual=0.0014780594501644373
         DEBUG:root:i=89 residual=0.0014384739333763719
         DEBUG:root:i=90 residual=0.0014005572302266955
         DEBUG:root:i=91 residual=0.0013642337871715426
         DEBUG:root:i=92 residual=0.0013293965021148324
         DEBUG:root:i=93 residual=0.0012959778541699052
         DEBUG:root:i=94 residual=0.001263876329176128
         DEBUG:root:i=95 residual=0.0012330460594967008
         DEBUG:root:i=96 residual=0.0012033917009830475
         DEBUG:root:i=97 residual=0.001174872275441885
         DEBUG:root:i=98 residual=0.0011474145576357841
         DEBUG:root:i=99 residual=0.0011209737276658416
         DEBUG:root:i=100 residual=0.0010954836616292596
         DEBUG:root:i=101 residual=0.0010709064081311226
         DEBUG:root:i=102 residual=0.0010471936548128724
         DEBUG:root:i=103 residual=0.0010242975549772382
         DEBUG:root:i=104 residual=0.00100218434818089
         DEBUG:root:i=105 residual=0.00098081910982728
         DEBUG:root:i=106 residual=0.0009601557976566255
         DEBUG:root:i=107 residual=0.0009401643765158951
         DEBUG:root:i=108 residual=0.0009208175470121205
         DEBUG:root:i=109 residual=0.0009020979050546885
         DEBUG:root:i=110 residual=0.0008839573129080236
         DEBUG:root:i=111 residual=0.000866368121933192
         DEBUG:root:i=112 residual=0.0008493199711665511
         DEBUG:root:i=113 residual=0.0008327721152454615
         DEBUG:root:i=114 residual=0.000816727289929986
         DEBUG:root:i=115 residual=0.0008011464960873127
         DEBUG:root:i=116 residual=0.0007860094192437828
         DEBUG:root:i=117 residual=0.0007713042432442307
         DEBUG:root:i=118 residual=0.0007570049492642283
         DEBUG:root:i=119 residual=0.0007431192207150161
         DEBUG:root:i=120 residual=0.0007295997929759324
         DEBUG:root:i=121 residual=0.0007164512644521892
         DEBUG:root:i=122 residual=0.0007036454626359046
         DEBUG:root:i=123 residual=0.0006911859381943941
         DEBUG:root:i=124 residual=0.0006790339248254895
         DEBUG:root:i=125 residual=0.0006672092131339014
         DEBUG:root:i=126 residual=0.0006556769367307425
         DEBUG:root:i=127 residual=0.000644430925603956
         DEBUG:root:i=128 residual=0.0006334679783321917
         DEBUG:root:i=129 residual=0.0006227655103430152
         DEBUG:root:i=130 residual=0.0006123287603259087
         DEBUG:root:i=131 residual=0.0006021387525834143
         DEBUG:root:i=132 residual=0.0005921924021095037
         DEBUG:root:i=133 residual=0.0005824716645292938
         DEBUG:root:i=134 residual=0.0005729826516471803
         DEBUG:root:i=135 residual=0.0005637070862576365
         DEBUG:root:i=136 residual=0.0005546381580643356
         DEBUG:root:i=137 residual=0.000545780174434185
         DEBUG:root:i=138 residual=0.0005371167790144682
         DEBUG:root:i=139 residual=0.0005286391242407262
         DEBUG:root:i=140 residual=0.0005203530308790505
         DEBUG:root:i=141 residual=0.0005122329457663
         DEBUG:root:i=142 residual=0.0005042985430918634
         DEBUG:root:i=143 residual=0.0004965272382833064
         DEBUG:root:i=144 residual=0.0004889093688689172
         DEBUG:root:i=145 residual=0.00048145302571356297
         DEBUG:root:i=146 residual=0.0004741534066852182
         DEBUG:root:i=147 residual=0.00046700096572749317
         DEBUG:root:i=148 residual=0.0004599933745339513
         DEBUG:root:i=149 residual=0.00045312679139897227
         DEBUG:root:i=150 residual=0.0004463912337087095
         DEBUG:root:i=151 residual=0.00043978646863251925
         DEBUG:root:i=152 residual=0.00043331467895768583
         DEBUG:root:i=153 residual=0.00042697106255218387
         DEBUG:root:i=154 residual=0.0004207405145280063
         DEBUG:root:i=155 residual=0.00041463071829639375
         DEBUG:root:i=156 residual=0.0004086443514097482
         DEBUG:root:i=157 residual=0.00040276054642163217
         DEBUG:root:i=158 residual=0.00039698302862234414
         DEBUG:root:i=159 residual=0.0003913186374120414
         DEBUG:root:i=160 residual=0.0003857545671053231
         DEBUG:root:i=161 residual=0.00038029320421628654
         DEBUG:root:i=162 residual=0.0003749268944375217
         DEBUG:root:i=163 residual=0.00036965610343031585
         DEBUG:root:i=164 residual=0.0003644881653599441
         DEBUG:root:i=165 residual=0.0003593963338062167
         DEBUG:root:i=166 residual=0.00035439812927506864
         DEBUG:root:i=167 residual=0.00034949349355883896
         DEBUG:root:i=168 residual=0.00034466685610823333
         DEBUG:root:i=169 residual=0.00033993145916610956
         DEBUG:root:i=170 residual=0.00033526966581121087
         DEBUG:root:i=171 residual=0.0003306938160676509
         DEBUG:root:i=172 residual=0.00032618685509078205
         DEBUG:root:i=173 residual=0.00032175728119909763
         DEBUG:root:i=174 residual=0.000317403202643618
         DEBUG:root:i=175 residual=0.00031312249484471977
         DEBUG:root:i=176 residual=0.0003089130623266101
         DEBUG:root:i=177 residual=0.00030476629035547376
         DEBUG:root:i=178 residual=0.00030069483909755945
         DEBUG:root:i=179 residual=0.00029668453498743474
         DEBUG:root:i=180 residual=0.0002927415771409869
         DEBUG:root:i=181 residual=0.0002888553717639297
         DEBUG:root:i=182 residual=0.0002850376768037677
         DEBUG:root:i=183 residual=0.0002812762977555394
         DEBUG:root:i=184 residual=0.0002775781322270632
         DEBUG:root:i=185 residual=0.0002739399787969887
         DEBUG:root:i=186 residual=0.0002703559584915638
         DEBUG:root:i=187 residual=0.0002668243832886219
         DEBUG:root:i=188 residual=0.0002633541589602828
         DEBUG:root:i=189 residual=0.00025993023882620037
         DEBUG:root:i=190 residual=0.0002565619070082903
         DEBUG:root:i=191 residual=0.00025324535090476274
         DEBUG:root:i=192 residual=0.0002499781840015203
         DEBUG:root:i=193 residual=0.0002467619488015771
         DEBUG:root:i=194 residual=0.00024359324015676975
         DEBUG:root:i=195 residual=0.00024047000624705106
         DEBUG:root:i=196 residual=0.00023739771859254688
         DEBUG:root:i=197 residual=0.00023436793708242476
         DEBUG:root:i=198 residual=0.00023138622054830194
         DEBUG:root:i=199 residual=0.00022844658815301955
         DEBUG:root:i=200 residual=0.0002255516592413187
         DEBUG:root:i=201 residual=0.00022269775217864662
         DEBUG:root:i=202 residual=0.00021988390653859824
         DEBUG:root:i=203 residual=0.00021711080626118928
         DEBUG:root:i=204 residual=0.00021437922259792686
         DEBUG:root:i=205 residual=0.000211689795833081
         DEBUG:root:i=206 residual=0.00020903782569803298
         DEBUG:root:i=207 residual=0.00020642278832383454
         DEBUG:root:i=208 residual=0.00020384354866109788
         DEBUG:root:i=209 residual=0.00020130380289629102
         DEBUG:root:i=210 residual=0.00019879733736161143
         DEBUG:root:i=211 residual=0.0001963248651009053
         DEBUG:root:i=212 residual=0.00019388859800528735
         DEBUG:root:i=213 residual=0.0001914903405122459
         DEBUG:root:i=214 residual=0.00018912261293735355
         DEBUG:root:i=215 residual=0.00018678774358704686
         DEBUG:root:i=216 residual=0.0001844896760303527
         DEBUG:root:i=217 residual=0.00018221828213427216
         DEBUG:root:i=218 residual=0.00017998278781305999
         DEBUG:root:i=219 residual=0.00017777625180315226
         DEBUG:root:i=220 residual=0.00017559819389134645
         DEBUG:root:i=221 residual=0.00017345235391985625
         DEBUG:root:i=222 residual=0.00017133599612861872
         DEBUG:root:i=223 residual=0.0001692451915005222
         DEBUG:root:i=224 residual=0.00016718772531021386
         DEBUG:root:i=225 residual=0.00016515505558345467
         DEBUG:root:i=226 residual=0.0001631500053917989
         DEBUG:root:i=227 residual=0.00016117207997012883
         DEBUG:root:i=228 residual=0.00015922266175039113
         DEBUG:root:i=229 residual=0.00015729584265500307
         DEBUG:root:i=230 residual=0.00015539934975095093
         DEBUG:root:i=231 residual=0.00015352752234321088
         DEBUG:root:i=232 residual=0.000151676926179789
         DEBUG:root:i=233 residual=0.00014985409507062286
         DEBUG:root:i=234 residual=0.000148051927681081
         DEBUG:root:i=235 residual=0.00014627586642745882
         DEBUG:root:i=236 residual=0.00014452151663135737
         DEBUG:root:i=237 residual=0.0001427954703103751
         DEBUG:root:i=238 residual=0.00014108797768130898
         DEBUG:root:i=239 residual=0.00013940119242761284
         DEBUG:root:i=240 residual=0.00013773904356639832
         DEBUG:root:i=241 residual=0.00013610039604827762
         DEBUG:root:i=242 residual=0.00013448078243527561
         DEBUG:root:i=243 residual=0.0001328849175479263
         DEBUG:root:i=244 residual=0.00013130728621035814
         DEBUG:root:i=245 residual=0.00012974858691450208
         DEBUG:root:i=246 residual=0.00012821114796679467
         DEBUG:root:i=247 residual=0.00012669485295191407
         DEBUG:root:i=248 residual=0.00012519625306595117
         DEBUG:root:i=249 residual=0.00012371878256089985
         DEBUG:root:i=250 residual=0.000122259501949884
         DEBUG:root:i=251 residual=0.00012081880413461477
         DEBUG:root:i=252 residual=0.00011939798423554748
         DEBUG:root:i=253 residual=0.00011799210187746212
         DEBUG:root:i=254 residual=0.00011660611198749393
         DEBUG:root:i=255 residual=0.00011523616558406502
         DEBUG:root:i=256 residual=0.00011388628627173603
         DEBUG:root:i=257 residual=0.00011255133722443134
         DEBUG:root:i=258 residual=0.00011123330477857962
         DEBUG:root:i=259 residual=0.0001099329165299423
         DEBUG:root:i=260 residual=0.0001086478732759133
         DEBUG:root:i=261 residual=0.00010738023411249742
         DEBUG:root:i=262 residual=0.00010612714686430991
         DEBUG:root:i=263 residual=0.00010488892439752817
         DEBUG:root:i=264 residual=0.00010366823698859662
         DEBUG:root:i=265 residual=0.00010246112651657313
         DEBUG:root:i=266 residual=0.00010126895358553156
         DEBUG:root:i=267 residual=0.00010009211837314069
         DEBUG:root:i=268 residual=9.892899106489494e-05
         DEBUG:root:i=269 residual=9.778246749192476e-05
         DEBUG:root:i=270 residual=9.665037941886112e-05
         DEBUG:root:i=271 residual=9.553277777740732e-05
         DEBUG:root:i=272 residual=9.442564623896033e-05
         DEBUG:root:i=273 residual=9.333469643024728e-05
         DEBUG:root:i=274 residual=9.225682151736692e-05
         DEBUG:root:i=275 residual=9.119172318605706e-05
         DEBUG:root:i=276 residual=9.013924864120781e-05
         DEBUG:root:i=277 residual=8.910078759072348e-05
         DEBUG:root:i=278 residual=8.807330596027896e-05
         DEBUG:root:i=279 residual=8.706083463039249e-05
         DEBUG:root:i=280 residual=8.605887705925852e-05
         DEBUG:root:i=281 residual=8.507017628289759e-05
         DEBUG:root:i=282 residual=8.409364090766758e-05
         DEBUG:root:i=283 residual=8.312944555655122e-05
         DEBUG:root:i=284 residual=8.217340655392036e-05
         DEBUG:root:i=285 residual=8.12315265648067e-05
         DEBUG:root:i=286 residual=8.03016300778836e-05
         DEBUG:root:i=287 residual=7.93828148744069e-05
         DEBUG:root:i=288 residual=7.847524102544412e-05
         DEBUG:root:i=289 residual=7.757759158266708e-05
         DEBUG:root:i=290 residual=7.669065962545574e-05
         DEBUG:root:i=291 residual=7.581511454191059e-05
         DEBUG:root:i=292 residual=7.495055615436286e-05
         DEBUG:root:i=293 residual=7.409635873045772e-05
         DEBUG:root:i=294 residual=7.325167825911194e-05
         DEBUG:root:i=295 residual=7.241672574309632e-05
         DEBUG:root:i=296 residual=7.159197411965579e-05
         DEBUG:root:i=297 residual=7.077823829604313e-05
         DEBUG:root:i=298 residual=6.997418677201495e-05
         DEBUG:root:i=299 residual=6.917937571415678e-05
         DEBUG:root:i=300 residual=6.839257548563182e-05
         DEBUG:root:i=301 residual=6.76158961141482e-05
         DEBUG:root:i=302 residual=6.68485663481988e-05
         DEBUG:root:i=303 residual=6.609157571801916e-05
         DEBUG:root:i=304 residual=6.534181738970801e-05
         DEBUG:root:i=305 residual=6.4601976191625e-05
         DEBUG:root:i=306 residual=6.387125904439017e-05
         DEBUG:root:i=307 residual=6.314684287644923e-05
         DEBUG:root:i=308 residual=6.24311578576453e-05
         DEBUG:root:i=309 residual=6.172778375912458e-05
         DEBUG:root:i=310 residual=6.102908810134977e-05
         DEBUG:root:i=311 residual=6.034163379808888e-05
         DEBUG:root:i=312 residual=5.9659447288140655e-05
         DEBUG:root:i=313 residual=5.8985835494240746e-05
         DEBUG:root:i=314 residual=5.832106762682088e-05
         DEBUG:root:i=315 residual=5.7661956816446036e-05
         DEBUG:root:i=316 residual=5.7010725868167356e-05
         DEBUG:root:i=317 residual=5.636864443658851e-05
         DEBUG:root:i=318 residual=5.573602902586572e-05
         DEBUG:root:i=319 residual=5.510967821464874e-05
         DEBUG:root:i=320 residual=5.4487798479385674e-05
         DEBUG:root:i=321 residual=5.387514102039859e-05
         DEBUG:root:i=322 residual=5.3271356591722e-05
         DEBUG:root:i=323 residual=5.2671115554403514e-05
         DEBUG:root:i=324 residual=5.2080988098168746e-05
         DEBUG:root:i=325 residual=5.1494589570211247e-05
         DEBUG:root:i=326 residual=5.091816638014279e-05
         DEBUG:root:i=327 residual=5.034670903114602e-05
         DEBUG:root:i=328 residual=4.978231663699262e-05
         DEBUG:root:i=329 residual=4.9225069233216345e-05
         DEBUG:root:i=330 residual=4.867307870881632e-05
         DEBUG:root:i=331 residual=4.8128153139259666e-05
         DEBUG:root:i=332 residual=4.7589681344106793e-05
         DEBUG:root:i=333 residual=4.705690298578702e-05
         DEBUG:root:i=334 residual=4.6531040425179526e-05
         DEBUG:root:i=335 residual=4.6010190999368206e-05
         DEBUG:root:i=336 residual=4.5495846279663965e-05
         DEBUG:root:i=337 residual=4.4988344598095864e-05
         DEBUG:root:i=338 residual=4.4486005208455026e-05
         DEBUG:root:i=339 residual=4.398934470373206e-05
         DEBUG:root:i=340 residual=4.349655864643864e-05
         DEBUG:root:i=341 residual=4.3013409595005214e-05
         DEBUG:root:i=342 residual=4.253243969287723e-05
         DEBUG:root:i=343 residual=4.205781806376763e-05
         DEBUG:root:i=344 residual=4.15892900491599e-05
         DEBUG:root:i=345 residual=4.1126848373096436e-05
         DEBUG:root:i=346 residual=4.06678591389209e-05
         DEBUG:root:i=347 residual=4.0215432818513364e-05
         DEBUG:root:i=348 residual=3.9766906411387026e-05
         DEBUG:root:i=349 residual=3.932619438273832e-05
         DEBUG:root:i=350 residual=3.888847277266905e-05
         DEBUG:root:i=351 residual=3.845540777547285e-05
         DEBUG:root:i=352 residual=3.8027792470529675e-05
         DEBUG:root:i=353 residual=3.7605612305924296e-05
         DEBUG:root:i=354 residual=3.71875285054557e-05
         DEBUG:root:i=355 residual=3.6773577448911965e-05
         DEBUG:root:i=356 residual=3.636644396465272e-05
         DEBUG:root:i=357 residual=3.596319584175944e-05
         DEBUG:root:i=358 residual=3.556347655830905e-05
         DEBUG:root:i=359 residual=3.516770084388554e-05
         DEBUG:root:i=360 residual=3.477870632195845e-05
         DEBUG:root:i=361 residual=3.439400461502373e-05
         DEBUG:root:i=362 residual=3.401244612177834e-05
         DEBUG:root:i=363 residual=3.363522046129219e-05
         DEBUG:root:i=364 residual=3.3263237128267065e-05
         DEBUG:root:i=365 residual=3.289396772743203e-05
         DEBUG:root:i=366 residual=3.2530784665141255e-05
         DEBUG:root:i=367 residual=3.2169988116947934e-05
         DEBUG:root:i=368 residual=3.181412466801703e-05
         DEBUG:root:i=369 residual=3.1461087928619236e-05
         DEBUG:root:i=370 residual=3.1113737350096926e-05
         DEBUG:root:i=371 residual=3.076847497140989e-05
         DEBUG:root:i=372 residual=3.0428151148953475e-05
         DEBUG:root:i=373 residual=3.009144165844191e-05
         DEBUG:root:i=374 residual=2.975965617224574e-05
         DEBUG:root:i=375 residual=2.94317833322566e-05
         DEBUG:root:i=376 residual=2.9105282010277733e-05
         DEBUG:root:i=377 residual=2.87856728391489e-05
         DEBUG:root:i=378 residual=2.8467313313740306e-05
         DEBUG:root:i=379 residual=2.8152218874311075e-05
         DEBUG:root:i=380 residual=2.7842044801218435e-05
         DEBUG:root:i=381 residual=2.75348993454827e-05
         DEBUG:root:i=382 residual=2.7230718842474744e-05
         DEBUG:root:i=383 residual=2.6931273168884218e-05
         DEBUG:root:i=384 residual=2.663364284671843e-05
         DEBUG:root:i=385 residual=2.6341316697653383e-05
         DEBUG:root:i=386 residual=2.6050645828945562e-05
         DEBUG:root:i=387 residual=2.5763265512068756e-05
         DEBUG:root:i=388 residual=2.5478786483290605e-05
         DEBUG:root:i=389 residual=2.5198274670401588e-05
         DEBUG:root:i=390 residual=2.4922792363213375e-05
         DEBUG:root:i=391 residual=2.464870340190828e-05
         DEBUG:root:i=392 residual=2.437617331452202e-05
         DEBUG:root:i=393 residual=2.4108247089316137e-05
         DEBUG:root:i=394 residual=2.3842987502575852e-05
         DEBUG:root:i=395 residual=2.3580560082336888e-05
         DEBUG:root:i=396 residual=2.332206531718839e-05
         DEBUG:root:i=397 residual=2.3064507331582718e-05
         DEBUG:root:i=398 residual=2.281130764458794e-05
         DEBUG:root:i=399 residual=2.256016159662977e-05
         DEBUG:root:i=400 residual=2.2313877707347274e-05
         DEBUG:root:i=401 residual=2.2066791643737815e-05
         DEBUG:root:i=402 residual=2.1823971110279672e-05
         DEBUG:root:i=403 residual=2.1585472495644353e-05
         DEBUG:root:i=404 residual=2.1347377696656622e-05
         DEBUG:root:i=405 residual=2.111458161380142e-05
         DEBUG:root:i=406 residual=2.0882851458736695e-05
         DEBUG:root:i=407 residual=2.065289118036162e-05
         DEBUG:root:i=408 residual=2.042709274974186e-05
         DEBUG:root:i=409 residual=2.02011924557155e-05
         DEBUG:root:i=410 residual=1.9979521312052384e-05
         DEBUG:root:i=411 residual=1.9761322619160637e-05
         DEBUG:root:i=412 residual=1.954420076799579e-05
         DEBUG:root:i=413 residual=1.9329343558638357e-05
         DEBUG:root:i=414 residual=1.9118924683425575e-05
         DEBUG:root:i=415 residual=1.8909277059719898e-05
         DEBUG:root:i=416 residual=1.8700478904065676e-05
         DEBUG:root:i=417 residual=1.849528052844107e-05
         DEBUG:root:i=418 residual=1.829484426707495e-05
         DEBUG:root:i=419 residual=1.809328568924684e-05
         DEBUG:root:i=420 residual=1.7895181372296065e-05
         DEBUG:root:i=421 residual=1.7699003365123644e-05
         DEBUG:root:i=422 residual=1.7504647985333577e-05
         DEBUG:root:i=423 residual=1.731540760374628e-05
         DEBUG:root:i=424 residual=1.7124788428191096e-05
         DEBUG:root:i=425 residual=1.6935324310907163e-05
         DEBUG:root:i=426 residual=1.675023668212816e-05
         DEBUG:root:i=427 residual=1.6569147192058153e-05
         DEBUG:root:i=428 residual=1.638637695577927e-05
         DEBUG:root:i=429 residual=1.620819602976553e-05
         DEBUG:root:i=430 residual=1.602952579560224e-05
         DEBUG:root:i=431 residual=1.5855599485803396e-05
         DEBUG:root:i=432 residual=1.568147672514897e-05
         DEBUG:root:i=433 residual=1.5508729120483622e-05
         DEBUG:root:i=434 residual=1.5339463061536662e-05
         DEBUG:root:i=435 residual=1.517303280706983e-05
         DEBUG:root:i=436 residual=1.5007686670287512e-05
         DEBUG:root:i=437 residual=1.484225140302442e-05
         DEBUG:root:i=438 residual=1.4680577805847861e-05
         DEBUG:root:i=439 residual=1.4519720025418792e-05
         DEBUG:root:i=440 residual=1.4362240108312108e-05
         DEBUG:root:i=441 residual=1.4204974831955042e-05
         DEBUG:root:i=442 residual=1.4047832337382715e-05
         DEBUG:root:i=443 residual=1.3894668882130645e-05
         DEBUG:root:i=444 residual=1.3743452655035071e-05
         DEBUG:root:i=445 residual=1.3591749848274048e-05
         DEBUG:root:i=446 residual=1.3444052456179634e-05
         DEBUG:root:i=447 residual=1.3298010344442446e-05
         DEBUG:root:i=448 residual=1.3153634426998906e-05
         DEBUG:root:i=449 residual=1.3009213944314979e-05
         DEBUG:root:i=450 residual=1.2866457836935297e-05
         DEBUG:root:i=451 residual=1.2725419765047263e-05
         DEBUG:root:i=452 residual=1.2587205674208235e-05
         DEBUG:root:i=453 residual=1.2453631825337652e-05
         DEBUG:root:i=454 residual=1.231362512044143e-05
         DEBUG:root:i=455 residual=1.2179385521449149e-05
         DEBUG:root:i=456 residual=1.20468548630015e-05
         DEBUG:root:i=457 residual=1.1915713912458159e-05
         DEBUG:root:i=458 residual=1.1784925845859107e-05
         DEBUG:root:i=459 residual=1.1657190952973906e-05
         DEBUG:root:i=460 residual=1.1530019946803804e-05
         DEBUG:root:i=461 residual=1.140332460636273e-05
         DEBUG:root:i=462 residual=1.1279984391876496e-05
         DEBUG:root:i=463 residual=1.1156753316754475e-05
         DEBUG:root:i=464 residual=1.1035427633032668e-05
         DEBUG:root:i=465 residual=1.0914944141404703e-05
         DEBUG:root:i=466 residual=1.079639514500741e-05
         DEBUG:root:i=467 residual=1.067899302142905e-05
         DEBUG:root:i=468 residual=1.0561459930613637e-05
         DEBUG:root:i=469 residual=1.0448122338857502e-05
         DEBUG:root:i=470 residual=1.03326474345522e-05
         DEBUG:root:i=471 residual=1.0220426702289842e-05
         DEBUG:root:i=472 residual=1.0109519280376844e-05
         DEBUG:root:i=473 residual=9.997517736337613e-06
         DEBUG:root:i=474 residual=9.890190085570794e-06
         DEBUG:root:i=475 residual=9.782183951756451e-06
         DEBUG:root:i=476 residual=9.67624055192573e-06
         DEBUG:root:i=477 residual=9.57028260017978e-06
         DEBUG:root:i=478 residual=9.466019946557935e-06
         DEBUG:root:i=479 residual=9.366169251734391e-06
         DEBUG:root:i=480 residual=9.25991298572626e-06
         DEBUG:root:i=481 residual=9.161411981040146e-06
         DEBUG:root:i=482 residual=9.061219316208735e-06
         DEBUG:root:i=483 residual=8.962610991147812e-06
         DEBUG:root:i=484 residual=8.864939445629716e-06
         DEBUG:root:i=485 residual=8.76844842423452e-06
         DEBUG:root:i=486 residual=8.672359399497509e-06
         DEBUG:root:i=487 residual=8.578732376918197e-06
         DEBUG:root:i=488 residual=8.484119462082162e-06
         DEBUG:root:i=489 residual=8.393281859753188e-06
         DEBUG:root:i=490 residual=8.3017703218502e-06
         DEBUG:root:i=491 residual=8.213298315240536e-06
         DEBUG:root:i=492 residual=8.122345207084436e-06
         DEBUG:root:i=493 residual=8.0337331382907e-06
         DEBUG:root:i=494 residual=7.945382094476372e-06
         DEBUG:root:i=495 residual=7.859487595851533e-06
         DEBUG:root:i=496 residual=7.778921826684382e-06
         DEBUG:root:i=497 residual=7.689205631322693e-06
         DEBUG:root:i=498 residual=7.606872259202646e-06
         DEBUG:root:i=499 residual=7.523688509536441e-06
         DEBUG:root:i=500 residual=7.44110684536281e-06
         DEBUG:root:i=501 residual=7.3615087785583455e-06
         DEBUG:root:i=502 residual=7.281331818376202e-06
         DEBUG:root:i=503 residual=7.203375389508437e-06
         DEBUG:root:i=504 residual=7.123986051738029e-06
         DEBUG:root:i=505 residual=7.046387963782763e-06
         DEBUG:root:i=506 residual=6.969655260036234e-06
         DEBUG:root:i=507 residual=6.894273610669188e-06
         DEBUG:root:i=508 residual=6.819269856350729e-06
         DEBUG:root:i=509 residual=6.744757229171228e-06
         DEBUG:root:i=510 residual=6.670717993983999e-06
         DEBUG:root:i=511 residual=6.5988883761747275e-06
         DEBUG:root:i=512 residual=6.529469374072505e-06
         DEBUG:root:i=513 residual=6.458086318161804e-06
         DEBUG:root:i=514 residual=6.386462700902484e-06
         DEBUG:root:i=515 residual=6.318886335066054e-06
         DEBUG:root:i=516 residual=6.249180387385422e-06
         DEBUG:root:i=517 residual=6.1828595789847896e-06
         DEBUG:root:i=518 residual=6.1144451137806755e-06
         DEBUG:root:i=519 residual=6.049062903912272e-06
         DEBUG:root:i=520 residual=5.9809635786223225e-06
         DEBUG:root:i=521 residual=5.917106591368793e-06
         DEBUG:root:i=522 residual=5.8539631027088035e-06
         DEBUG:root:i=523 residual=5.788610906165559e-06
         DEBUG:root:i=524 residual=5.726822564611211e-06
         DEBUG:root:i=525 residual=5.663331194227794e-06
         DEBUG:root:i=526 residual=5.602377314062323e-06
         DEBUG:root:i=527 residual=5.543129191210028e-06
         DEBUG:root:i=528 residual=5.481153493747115e-06
         DEBUG:root:i=529 residual=5.422243248176528e-06
         DEBUG:root:i=530 residual=5.363051968743093e-06
         DEBUG:root:i=531 residual=5.304979367792839e-06
         DEBUG:root:i=532 residual=5.2472532843239605e-06
         DEBUG:root:i=533 residual=5.190038791624829e-06
         DEBUG:root:i=534 residual=5.135286301083397e-06
         DEBUG:root:i=535 residual=5.078846697870176e-06
         DEBUG:root:i=536 residual=5.022950062993914e-06
         DEBUG:root:i=537 residual=4.968299890606431e-06
         DEBUG:root:i=538 residual=4.915227691526525e-06
         DEBUG:root:i=539 residual=4.861195520788897e-06
         DEBUG:root:i=540 residual=4.808141511603026e-06
         DEBUG:root:i=541 residual=4.7565940803906415e-06
         DEBUG:root:i=542 residual=4.706119398178998e-06
         DEBUG:root:i=543 residual=4.654755230149021e-06
         DEBUG:root:i=544 residual=4.603833531291457e-06
         DEBUG:root:i=545 residual=4.554462975647766e-06
         DEBUG:root:i=546 residual=4.504268417804269e-06
         DEBUG:root:i=547 residual=4.455523594515398e-06
         DEBUG:root:i=548 residual=4.4066091504646465e-06
         DEBUG:root:i=549 residual=4.359340437076753e-06
         DEBUG:root:i=550 residual=4.311761585995555e-06
         DEBUG:root:i=551 residual=4.2650972318369895e-06
         DEBUG:root:i=552 residual=4.219451966491761e-06
         DEBUG:root:i=553 residual=4.1730108932824805e-06
         DEBUG:root:i=554 residual=4.132204139750684e-06
         DEBUG:root:i=555 residual=4.088300102011999e-06
         DEBUG:root:i=556 residual=4.041223746753531e-06
         DEBUG:root:i=557 residual=3.996011400886346e-06
         DEBUG:root:i=558 residual=3.953571194870165e-06
         DEBUG:root:i=559 residual=3.91051344195148e-06
         DEBUG:root:i=560 residual=3.867212399200071e-06
         DEBUG:root:i=561 residual=3.826037755061407e-06
         DEBUG:root:i=562 residual=3.784243062909809e-06
         DEBUG:root:i=563 residual=3.7427885217766743e-06
         DEBUG:root:i=564 residual=3.703125003085006e-06
         DEBUG:root:i=565 residual=3.6617802834371105e-06
         DEBUG:root:i=566 residual=3.621431233113981e-06
         DEBUG:root:i=567 residual=3.582340468710754e-06
         DEBUG:root:i=568 residual=3.544490937201772e-06
         DEBUG:root:i=569 residual=3.506020220811479e-06
         DEBUG:root:i=570 residual=3.4678121210163226e-06
         DEBUG:root:i=571 residual=3.4314482491026865e-06
         DEBUG:root:i=572 residual=3.394102350284811e-06
         DEBUG:root:i=573 residual=3.356242586960434e-06
         DEBUG:root:i=574 residual=3.319063580420334e-06
         DEBUG:root:i=575 residual=3.2841014672158053e-06
         DEBUG:root:i=576 residual=3.2497921438334743e-06
         DEBUG:root:i=577 residual=3.213293894077651e-06
         DEBUG:root:i=578 residual=3.1783024496689904e-06
         DEBUG:root:i=579 residual=3.1435986329597654e-06
         DEBUG:root:i=580 residual=3.1102076718525495e-06
         DEBUG:root:i=581 residual=3.084524450969184e-06
         DEBUG:root:i=582 residual=3.042908247152809e-06
         DEBUG:root:i=583 residual=3.0106295980658615e-06
         DEBUG:root:i=584 residual=2.976075847982429e-06
         DEBUG:root:i=585 residual=2.946338327092235e-06
         DEBUG:root:i=586 residual=2.9123448257450946e-06
         DEBUG:root:i=587 residual=2.8825286335631972e-06
         DEBUG:root:i=588 residual=2.8513522920547985e-06
         DEBUG:root:i=589 residual=2.820150939442101e-06
         DEBUG:root:i=590 residual=2.7901116936845938e-06
         DEBUG:root:i=591 residual=2.7595501705945935e-06
         DEBUG:root:i=592 residual=2.728439085331047e-06
         DEBUG:root:i=593 residual=2.700315462789149e-06
         DEBUG:root:i=594 residual=2.6718678327597445e-06
         DEBUG:root:i=595 residual=2.641698074512533e-06
         DEBUG:root:i=596 residual=2.612714752103784e-06
         DEBUG:root:i=597 residual=2.584586354714702e-06
         DEBUG:root:i=598 residual=2.557664174673846e-06
         DEBUG:root:i=599 residual=2.5287010885222116e-06
         DEBUG:root:i=600 residual=2.501485141692683e-06
         DEBUG:root:i=601 residual=2.474821485520806e-06
         DEBUG:root:i=602 residual=2.4470089101669146e-06
         DEBUG:root:i=603 residual=2.4214275526901474e-06
         DEBUG:root:i=604 residual=2.394443981756922e-06
         DEBUG:root:i=605 residual=2.3691122805757914e-06
         DEBUG:root:i=606 residual=2.3438799416908296e-06
         DEBUG:root:i=607 residual=2.3192687876871787e-06
         DEBUG:root:i=608 residual=2.2960302885621786e-06
         DEBUG:root:i=609 residual=2.268468961119652e-06
         DEBUG:root:i=610 residual=2.247153815915226e-06
         DEBUG:root:i=611 residual=2.2197666567080887e-06
         DEBUG:root:i=612 residual=2.1947639652353246e-06
         DEBUG:root:i=613 residual=2.171784217352979e-06
         DEBUG:root:i=614 residual=2.1481660041899886e-06
         DEBUG:root:i=615 residual=2.1238256522337906e-06
         DEBUG:root:i=616 residual=2.1017035578552168e-06
         DEBUG:root:i=617 residual=2.080130343529163e-06
         DEBUG:root:i=618 residual=2.0558293272188166e-06
         DEBUG:root:i=619 residual=2.035118086496368e-06
         DEBUG:root:i=620 residual=2.011685637626215e-06
         DEBUG:root:i=621 residual=1.9920960312447278e-06
         DEBUG:root:i=622 residual=1.968408923858078e-06
         DEBUG:root:i=623 residual=1.9484482436382677e-06
         DEBUG:root:i=624 residual=1.927034645632375e-06
         DEBUG:root:i=625 residual=1.903354245769151e-06
         DEBUG:root:i=626 residual=1.8848928675652132e-06
         DEBUG:root:i=627 residual=1.8648157720235758e-06
         DEBUG:root:i=628 residual=1.8447710772306891e-06
         DEBUG:root:i=629 residual=1.8243389376948471e-06
         DEBUG:root:i=630 residual=1.8042399005935295e-06
         DEBUG:root:i=631 residual=1.7903255411511054e-06
         DEBUG:root:i=632 residual=1.7672318790573627e-06
         DEBUG:root:i=633 residual=1.7500555031801923e-06
         DEBUG:root:i=634 residual=1.7308096857959754e-06
         DEBUG:root:i=635 residual=1.7128412537203985e-06
         DEBUG:root:i=636 residual=1.6909808664422599e-06
         DEBUG:root:i=637 residual=1.6738881640776526e-06
         DEBUG:root:i=638 residual=1.6557966091568233e-06
         DEBUG:root:i=639 residual=1.6362808992198552e-06
         DEBUG:root:i=640 residual=1.6179930071302806e-06
         DEBUG:root:i=641 residual=1.6014919310691766e-06
         DEBUG:root:i=642 residual=1.5859955055930186e-06
         DEBUG:root:i=643 residual=1.5672285371692851e-06
         DEBUG:root:i=644 residual=1.550790784676792e-06
         DEBUG:root:i=645 residual=1.5377654563053511e-06
         DEBUG:root:i=646 residual=1.5260876580214244e-06
         DEBUG:root:i=647 residual=1.503594603491365e-06
         DEBUG:root:i=648 residual=1.4831118733127369e-06
         DEBUG:root:i=649 residual=1.468720483899233e-06
         DEBUG:root:i=650 residual=1.4564999446520233e-06
         DEBUG:root:i=651 residual=1.4379874073711107e-06
         DEBUG:root:i=652 residual=1.4211528878149693e-06
         DEBUG:root:i=653 residual=1.4076056231715484e-06
         DEBUG:root:i=654 residual=1.390565671499644e-06
         DEBUG:root:i=655 residual=1.3766396023129346e-06
         DEBUG:root:i=656 residual=1.362696025353216e-06
         DEBUG:root:i=657 residual=1.3466877817336353e-06
         DEBUG:root:i=658 residual=1.3309678479345166e-06
         DEBUG:root:i=659 residual=1.3159411764718243e-06
         DEBUG:root:i=660 residual=1.3018988056501257e-06
         DEBUG:root:i=661 residual=1.2877704875791096e-06
         DEBUG:root:i=662 residual=1.2746759239234962e-06
         DEBUG:root:i=663 residual=1.2621732139450614e-06
         DEBUG:root:i=664 residual=1.2526697901193984e-06
         DEBUG:root:i=665 residual=1.2328425782470731e-06
         DEBUG:root:i=666 residual=1.223842332365166e-06
         DEBUG:root:i=667 residual=1.2059093705829582e-06
         DEBUG:root:i=668 residual=1.1950755833822768e-06
         DEBUG:root:i=669 residual=1.183338667942735e-06
         DEBUG:root:i=670 residual=1.1703110658345395e-06
         DEBUG:root:i=671 residual=1.1573677056730958e-06
         DEBUG:root:i=672 residual=1.1454495734142256e-06
         DEBUG:root:i=673 residual=1.131956082645047e-06
         DEBUG:root:i=674 residual=1.1185400126123568e-06
         DEBUG:root:i=675 residual=1.107506250264123e-06
         DEBUG:root:i=676 residual=1.098294433177216e-06
         DEBUG:root:i=677 residual=1.0824064702319447e-06
         DEBUG:root:i=678 residual=1.0730774420153466e-06
         DEBUG:root:i=679 residual=1.0612884580041282e-06
         DEBUG:root:i=680 residual=1.050175683303678e-06
         DEBUG:root:i=681 residual=1.03606441825832e-06
         DEBUG:root:i=682 residual=1.0244245913781924e-06
         DEBUG:root:i=683 residual=1.014870690596581e-06
         DEBUG:root:i=684 residual=1.0036894764198223e-06
         DEBUG:root:i=685 residual=9.915290775097674e-07
         INFO:root:rank=0 pagerank=7.0149e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
         INFO:root:rank=1 pagerank=7.0149e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
         INFO:root:rank=2 pagerank=1.0552e-01 url=www.lawfareblog.com/cost-using-zero-days
         INFO:root:rank=3 pagerank=3.1757e-02 url=www.lawfareblog.com/lawfare-podcast-former-congressman-brian-baird-and-daniel-schuman-how-congress-can-continue-function
         INFO:root:rank=4 pagerank=2.2040e-02 url=www.lawfareblog.com/events
         INFO:root:rank=5 pagerank=1.6027e-02 url=www.lawfareblog.com/water-wars-increased-us-focus-indo-pacific
         INFO:root:rank=6 pagerank=1.6026e-02 url=www.lawfareblog.com/water-wars-drill-maybe-drill
         INFO:root:rank=7 pagerank=1.6023e-02 url=www.lawfareblog.com/water-wars-disjointed-operations-south-china-sea
         INFO:root:rank=8 pagerank=1.6020e-02 url=www.lawfareblog.com/water-wars-song-oil-and-fire
         INFO:root:rank=9 pagerank=1.6020e-02 url=www.lawfareblog.com/water-wars-sinking-feeling-philippine-china-relations
    
    

    Task 2, part 1:

    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2 --personalization_vector_query='corona'
    
     INFO:root:rank=0 pagerank=6.3127e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
     INFO:root:rank=1 pagerank=6.3124e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
     INFO:root:rank=2 pagerank=1.5947e-01 url=www.lawfareblog.com/chinatalk-how-party-takes-its-propaganda-global
     INFO:root:rank=3 pagerank=1.2209e-01 url=www.lawfareblog.com/brexit-not-immune-coronavirus
     INFO:root:rank=4 pagerank=1.2209e-01 url=www.lawfareblog.com/rational-security-my-corona-edition
     INFO:root:rank=5 pagerank=9.3360e-02 url=www.lawfareblog.com/trump-cant-reopen-country-over-state-objections
     INFO:root:rank=6 pagerank=9.1920e-02 url=www.lawfareblog.com/prosecuting-purposeful-coronavirus-exposure-terrorism
     INFO:root:rank=7 pagerank=9.1920e-02 url=www.lawfareblog.com/britains-coronavirus-response
     INFO:root:rank=8 pagerank=7.7770e-02 url=www.lawfareblog.com/lawfare-podcast-united-nations-and-coronavirus-crisis
     INFO:root:rank=9 pagerank=7.2888e-02 url=www.lawfareblog.com/house-oversight-committee-holds-day-two-hearing-government-coronavirus-response
    

    Task 2, part 2:

    $ python3 pagerank.py --data=data/lawfareblog.csv.gz --filter_ratio=0.2 --personalization_vector_query='corona' --search_query='-corona'
    
     INFO:root:rank=0 pagerank=6.3127e-01 url=www.lawfareblog.com/covid-19-speech-and-surveillance-response
     INFO:root:rank=1 pagerank=6.3124e-01 url=www.lawfareblog.com/lawfare-live-covid-19-speech-and-surveillance
     INFO:root:rank=2 pagerank=1.5947e-01 url=www.lawfareblog.com/chinatalk-how-party-takes-its-propaganda-global
     INFO:root:rank=3 pagerank=9.3360e-02 url=www.lawfareblog.com/trump-cant-reopen-country-over-state-objections
     INFO:root:rank=4 pagerank=7.0277e-02 url=www.lawfareblog.com/fault-lines-foreign-policy-quarantined
     INFO:root:rank=5 pagerank=6.9713e-02 url=www.lawfareblog.com/lawfare-podcast-mom-and-dad-talk-clinical-trials-pandemic
     INFO:root:rank=6 pagerank=6.4944e-02 url=www.lawfareblog.com/limits-world-health-organization
     INFO:root:rank=7 pagerank=5.9492e-02 url=www.lawfareblog.com/chinatalk-dispatches-shanghai-beijing-and-hong-kong
     INFO:root:rank=8 pagerank=5.1245e-02 url=www.lawfareblog.com/us-moves-dismiss-case-against-company-linked-ira-troll-farm
     INFO:root:rank=9 pagerank=5.1245e-02 url=www.lawfareblog.com/livestream-house-armed-services-holds-hearing-national-security-challenges-north-and-south-america
    
    
  3. Ensure that all your changes to the pagerank.py and README.md files are committed to your repo and pushed to github.

  4. Get at least 5 stars on your repo. (You made trade stars with other students in the class.)

    NOTE:

    Recruiters use github profiles to determine who to hire, and pagerank is used to rank user profiles and projects. Links in this graph correspond to who has starred/followed who's repo. By getting more stars on your repo, you'll be increasing your github pagerank, which increases the likelihood that recruiters will hire you. To see an example, perform a search for data mining. Notice that the results are returned "approximately" ranked by the number of stars, but because "some stars count more than others" the results are not exactly ranked by the number of stars. (I asked you not to fork this repo because forks are ranked lower than non-forks.)

    In some sense, we are doing a "dual problem" to data mining by getting these stars. Recruiters are using data mining to find out who the best people to recruit are, and we are hacking their data mining algorithms by making those algorithms select you instead of someone else.

    If you're interested in exploring this idea further, here's a python tutorial for extracting GitHub's social graph: https://www.oreilly.com/library/view/mining-the-social/9781449368180/ch07.html ; if you're interested in learning more about how recruiters use github profiles, read this Hacker News post: https://news.ycombinator.com/item?id=19413348.

  5. Submit the url of your repo to sakai.

    Each part is worth 2 points, for 12 points overall.