
A simple script to convert DMOZ content.rdf.u8.gz into a CSV file.


jainprateek/dmoz_csv


dmoz_csv

A Python script that converts the DMOZ content.rdf.u8.gz dump into a CSV file. The repository also includes the CSV file generated by the script.

Each line of the CSV file has the following structure:

"URL","Category 1","Category 2",..........
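Because the number of category columns varies from line to line, a reader has to treat everything after the first field as a list. The following is a minimal sketch of how such lines could be read with Python's standard `csv` module; the function name `parse_rows` and the in-memory sample are illustrative assumptions and are not part of this repository.

```python
import csv
import io


def parse_rows(lines):
    """Yield (url, categories) pairs from lines shaped like "URL","Category 1",...

    `lines` is any iterable of text lines, e.g. an open file object.
    (Hypothetical helper, not taken from the repository's script.)
    """
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        # First field is the URL; every remaining field is a category path.
        yield row[0], row[1:]


# Small in-memory sample that follows the documented structure.
sample = io.StringIO(
    '"http://www.demus.it/",'
    '"Top/Business/Food_and_Related_Products/Beverages/Coffee",'
    '"Top/World/Italiano/Affari/Alimentazione_e_Prodotti_Correlati/Bevande/Caffè"\n'
)
rows = list(parse_rows(sample))
```

When reading the real file, opening it with `encoding="utf-8"` and `newline=""` is the usual choice for the `csv` module, assuming the output is UTF-8 like the source dump.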

Example:

http://www.demus.it/

is listed in four DMOZ categories:

Business: Food and Related Products: Beverages: Coffee

Regional: Europe: Italy: Regions: Friuli-Venezia Giulia: Localities: Trieste: Business and Economy

World: Italiano: Affari: Alimentazione e Prodotti Correlati: Bevande: Caffè

World: Italiano: Regionale: Europa: Italia: Friuli-Venezia Giulia: Provincia di Trieste: Località: Trieste: Affari e Economia

The corresponding line in the CSV file is:

"http://www.demus.it/","Top/Regional/Europe/Italy/Friuli-Venezia_Giulia/Localities/Trieste/Business_and_Economy","Top/World/Italiano/Affari/Alimentazione_e_Prodotti_Correlati/Bevande/Caffè","Top/World/Italiano/Regionale/Europa/Italia/Friuli-Venezia_Giulia/Provincia_di_Trieste/Località/Trieste/Affari_e_Economia","Top/Business/Food_and_Related_Products/Beverages/Coffee"
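A line of this shape can be produced by grouping category paths per URL and writing each group as one fully quoted row. The sketch below illustrates that idea only; it is not the repository's script, it does not parse the RDF dump, and the names `group_by_url` and `write_rows` as well as the `(url, category)` input pairs are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def group_by_url(pairs):
    """Collect category paths per URL, keeping first-seen order and dropping repeats."""
    grouped = defaultdict(list)
    for url, category in pairs:
        if category not in grouped[url]:
            grouped[url].append(category)
    return grouped


def write_rows(grouped, out):
    """Write one row per URL: the URL first, then its categories, every field quoted."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for url, categories in grouped.items():
        writer.writerow([url, *categories])


# Hypothetical input: (url, category path) pairs already extracted from the dump.
pairs = [
    ("http://www.demus.it/", "Top/Regional/Europe/Italy/Friuli-Venezia_Giulia/Localities/Trieste/Business_and_Economy"),
    ("http://www.demus.it/", "Top/Business/Food_and_Related_Products/Beverages/Coffee"),
    ("http://www.demus.it/", "Top/Business/Food_and_Related_Products/Beverages/Coffee"),
]
buffer = io.StringIO()
write_rows(group_by_url(pairs), buffer)
output = buffer.getvalue()
```

Using `csv.QUOTE_ALL` matches the quoting seen in the example line and keeps URLs or category names that contain commas intact.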
