New courts #74
Replace \\N{EM DASH} with em dash symbol
Update test for em dash in court name
I'm going to duck out of this one. All yours for review, @flooie.
"Superior Court Judicial District of Middlesex at Middletown", | ||
"Superior Court Judicial District of Ansonia-Milford at Milford", | ||
"Superior Court Judicial District of Ansonia at Milford", | ||
"Court of Common Pleas Hartford County" |
this should be something else I think.
"Connecticut Compensation Review Board", | ||
"Workers' Compensation Commission" |
how sure are we that this is the same court?
"Court of Quarter Sessions.* Delaware", | ||
"Court of General Sessions New Castle Delaware", | ||
"Court of General Sessions of The Peace And Jail Delivery of Delaware Kent County", | ||
"Court of General Sessions of Delaware Kent County" | ||
"Court of General Sessions of Delaware Kent County", | ||
"Courts of General Sessions and Oyer and Terminer of Delaware" |
these are probably distinct courts.
@@ -3323,7 +3383,8 @@ | |||
"name_abbreviation": null, | |||
"regex": [ | |||
"Florida ${sup}", | |||
"${sup} Florida" | |||
"${sup} Florida", | |||
"Court of Florida, Division B" |
This is likely not the Supreme Court. I'm not sure what Division B is, but we probably need to identify it.
"Appellate Court of Illinois Fifth District Industrial Commission ?(Division)?" | ||
"Appellate Court of Illinois,? (First|Second|Third|Fourth|Fifth) District(\\.|,)? Industrial Commission Division", | ||
"Appellate Court of Illinois,? (First|Second|Third|Fourth|Fifth) District(\\.|,)? Industrial Commission", | ||
"Appellate Court of Illinois,? (First|Second|Third|Fourth|Fifth) District(\\.|,)? Workers' Compensation Commission Division" |
I'm not sure these are the same thing. They could be, but I think this requires more research.
"Superior Court of Massachusetts" | ||
"Superior Court of Massachusetts", | ||
"Commonwealth of Massachusetts Superior Court", | ||
"Commonwealth of Massachusetts Superior Court,? (MIDDLESEX|Middlesex)" |
Do we actually need both of these regex patterns here for Middlesex? I assume we would have some normalization of capitalization.
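To make the capitalization question concrete, here is a minimal sketch. It assumes the matcher could compile patterns with `re.IGNORECASE`; whether courts-db actually does that is not confirmed in this thread. The third sample string is made up for illustration.

```python
import re

# Pattern from the diff, with both capitalizations spelled out.
explicit = re.compile(
    r"Commonwealth of Massachusetts Superior Court,? (MIDDLESEX|Middlesex)"
)
# Single spelling, compiled case-insensitively (assumption: the matcher allows this flag).
folded = re.compile(
    r"Commonwealth of Massachusetts Superior Court,? Middlesex", re.IGNORECASE
)

samples = [
    "Commonwealth of Massachusetts Superior Court, MIDDLESEX",
    "Commonwealth of Massachusetts Superior Court Middlesex",
    "Commonwealth of Massachusetts Superior Court, middlesex",  # made-up variant
]

explicit_hits = [bool(explicit.fullmatch(s)) for s in samples]
folded_hits = [bool(folded.fullmatch(s)) for s in samples]
```

If case-insensitive matching is already in place, the alternation adds nothing; if it is not, the explicit alternation still misses any other capitalization.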
Also, I assume that we have a superior court breakdown here.
"Supreme Court of New Hampshire(\\.|,)? (Auburn|Berlin|Claremont|Colebrook|Concord|Derry|Dover|Durham|Exeter|Franklin|Goffstown|Gorham|Greenville|Hampton|Hanover|Haverhill|Henniker|Hillsborough|Hooksett|Jaffrey|Keene|Laconia|Lancaster|Lebanon|Manchester|Merrimack|Milford|Nashua|Newport|New London|Ossipee|Peterborough|Pittsfield|Plaistow|Plymouth|Portsmouth|Rochester|Salem|U.S.|Wilton|) District Court", | ||
"Supreme Court of New Hampshire(\\.|,)? (Amherst|Concord|Conway|Dover|Durham|Epping|Exeter|Farmington|Greenville|Hampton|Jaffrey|Keene|Laconia|Lebanon|Manchester|Milford|Nashua|Pelham|Peterborough|Pittsfield|Portsmouth|Rye|Wilton|Wolfeboro) Municipal Court", | ||
"Supreme Court of New Hampshire(\\.|,)? Municipal Court of (Amherst|Berlin|Charlestown|Concord|Conway|Dover|Exeter|Farmington|Greenville|Hampton|Jaffrey|Keene|Laconia|Lebanon|Manchester|Milford|Nashua|Portsmouth|Rye|Wilton|Wolfeboro)" |
This should probably be broken up into its parent and child courts, now that we have a mechanism for it.
"New Jersey Superior Court,? Appellate Division" | ||
"New Jersey Superior Court,? Appellate Division", | ||
"Superior Court of New Jersey, Appellate Division", | ||
"Prerogative Court", |
I'm not sure what the Prerogative Court is, but I think it's probably not this.
"Courts of Errors And Appeals of New Jersey", | ||
"Court of Errors and Appeals", | ||
"Court of Errors and Appeals. Court of Chancery and Prerogative Court", | ||
"Court of Chancery and Prerogative Court. Court of Errors and Appeals" |
Is this one court or multiple ones?
Also, the Court of Chancery is listed below this.
"New Jersey Superior Court, Law Division" | ||
"New Jersey Superior Court, Law Division", | ||
"Superior Court of New Jersey, Law Division", | ||
"Superior Court of New Jersey, Chancery Division" |
Here is chancery again?
@@ -29023,7 +29285,7 @@ | |||
"Domestic Relations Court of The City of New York Family Court Division( Queens County)?", | |||
"Domestic Relations Court of The City of New York Children S Court Division( ${bw} County)?", | |||
"Fam Ct Bronx County", | |||
"Fam(ily)? C(our)?t (Tompkins|Livingston|Chautauqua|Bronx) County", | |||
"Fam(ily)? C(our)?t(\\.|,)? (Albany|Allegany|Bronx|Broome|Cayuga|Chautauqua|Chemung|Clinton|Dutchess|Erie|Essex|Fulton|Genesee|Jefferson|Kings|Lewis|Livingston|Madison|Monroe|Montgomery|Nassau|Niagara|Oneida|Onondaga|Ontario|Orange|Oswego|Putnam|Queens|Rensselaer|Richmond|Rockland|Saratoga|Schenectady|Schoharie|Schuyler|Seneca|St. Lawrence|Steuben|Suffolk|Tompkins|Ulster|Warren|Washington|Wayne|Yates) County", |
Typo in St. Lawrence: the period needs to be escaped (\.).
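A quick illustration of why the unescaped period matters, using a shortened version of the family court pattern (the county list is trimmed and the misspelled sample is made up):

```python
import re

# Shortened stand-ins for the pattern in the diff.
unescaped = re.compile(r"Family Court,? (St. Lawrence|Suffolk) County")
escaped = re.compile(r"Family Court,? (St\. Lawrence|Suffolk) County")

good = "Family Court, St. Lawrence County"
typo = "Family Court, Stx Lawrence County"  # made-up string, should not match

# An unescaped "." accepts any character in that position.
unescaped_results = (bool(unescaped.fullmatch(good)), bool(unescaped.fullmatch(typo)))
escaped_results = (bool(escaped.fullmatch(good)), bool(escaped.fullmatch(typo)))
```

In the JSON file itself the escape would be written with a doubled backslash, as the other patterns in this diff already do with `(\\.|,)?`.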
"Surrogate'?s? Court(\\.|,)? (Albany|Allegany|Broome|Bronx|Cattaraugus|Cayuga|Chautauqua|Chemung|Chenango|Clinton|Columbia|Cortland|Delaware|Dutchess|Erie|Essex|Franklin|Fulton|Genesee|Greene|Hamilton|Herkimer|Jefferson|King|Kings|Lewis|Livingston|Madison|Montgomery|Monroe|Nassau|Niagara|New York|Orleans|Oneida|Onondaga|Ontario|Orange|Otsego|Oswego|Putnam|Queens|Rennsselaer|Rensselaer|Richmond|Rockland|Saratoga|Schenectady|Schoharie|Schuyler|Seneca|St. Lawrence|Steuben|Suffolk|Sullivan|Tioga|Tompkins|Ulster|Warren|Washington|Wayne|Westchester|Wyoming|Yates) County", | ||
"New York Surrogate's Court, \b.*\b County", | ||
"Surrogate's Court", | ||
"Surrogates' Court, New York County", | ||
"Surrogate'? Court, (Allegany|Cattaraugus|Kings|Saratoga|Westchester) County", | ||
"Surrogate's Court of the City of New York(\\.|,)? (Albany|Bronx|Chautauqua|Dutchess|Erie|Kings|Nassau|Monroe|New York|Queens|Richmond|Rockland|Schenectady|Schoharie|Schuyler|Seneca|St. Lawrence|Steuben|Suffolk|Westchester|Wyoming) County", | ||
"Surrogate's Court of the State of New York, Nassau County" |
I think this would be cleaner if it were broken up into a surrogate's court for each county.
So a parent NY Surrogate's Court and then each subdivision, no?
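A rough sketch of what the per-county split could look like. The `parent` key, the `nysurct` id, and the entry shape are all hypothetical here and are not taken from the courts-db schema; the point is only that each county gets its own narrow pattern under one shared parent.

```python
import re


def surrogate_children(counties, parent_id="nysurct"):
    """Build one hypothetical child entry per county under a shared parent."""
    children = []
    for county in counties:
        # re.escape takes care of the period in names like "St. Lawrence".
        pattern = rf"Surrogate'?s? Court(\.|,)? {re.escape(county)} County"
        children.append(
            {
                "name": f"Surrogate's Court, {county} County",
                "parent": parent_id,  # hypothetical field and id
                "regex": [pattern],
            }
        )
    return children


children = surrogate_children(["Kings", "St. Lawrence"])
```

With this layout, a bare "Surrogate's Court" string would only need to live on the parent entry, if it is kept at all.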
@@ -29498,7 +29778,8 @@ | |||
"regex": [ | |||
"New York Court of Claims", | |||
"Court of Claims? of New York", | |||
"New York Claims Court" | |||
"New York Claims Court", | |||
"Court of Claims" |
While this is probably how it's stated, I'm not sure we can use a bare "Court of Claims" here. It's a bit tough because the Court of Claims is also a federal court.
"Superior Court for Law and Equity, (Carthage|Hamilton|Jonesborough|Knoxville|Mero|Nashville|Robertson|Washington|Winchester) District", | ||
"Superior Court for Law and Equity, District of (Carthage|Hamilton|Knoxville|Mero|Nashville|Robertson|Washington|Winchester)", | ||
"Circuit Court of the United States, Nashville", | ||
"Supreme Court of (Clarksville|Knoxville|Nashville|Sparta)", | ||
"Superior Court for Law and Equity" |
Are we sure these are all the same? Can we double-check these?
@@ -54896,7 +56758,8 @@ | |||
"location": "England", | |||
"name": "Court of King's Bench", | |||
"regex": [ | |||
"Court of Kings Bench" | |||
"Court of Kings Bench", | |||
"Court of King's Bench Latch's Reports" |
This is not the court name but the reporter, it seems.
@@ -59960,7 +61823,8 @@ | |||
"regex": [ | |||
"Circuit Court D Tenn", | |||
"Circuit Court Tennessee", | |||
"United States Circuit Court for the District of Tennessee" | |||
"United States Circuit Court for the District of Tennessee", | |||
"Federal Court, Nashville" |
This is weird; it's more like a description. Do we have this in the dataset?
@@ -105,7 +105,8 @@ | |||
"regex": [ | |||
"${coa} Alabama", | |||
"State of Alabama ${coa}", | |||
"Alabama Court of Appeals" | |||
"Alabama Court of Appeals", | |||
"Court of Appeals of Alabama" |
I think "${coa} Alabama" is equivalent to "Court of Appeals of Alabama".
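One way to check the equivalence is to expand the template and test the new string against it. The value assigned to `coa` below is a placeholder I made up for illustration; the real definition lives in the project's variables and may differ, so the outcome should be re-checked against that.

```python
import re
from string import Template

# Hypothetical definition of ${coa}; substitute the project's real value before relying on this.
variables = {"coa": "Court of Appeals( of)?"}

expanded = Template("${coa} Alabama").substitute(variables)
already_covered = re.fullmatch(expanded, "Court of Appeals of Alabama") is not None
```

If the real expansion already matches "Court of Appeals of Alabama", the added literal pattern is redundant and can be dropped.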
@@ -24330,7 +24519,8 @@ | |||
"State of New Jersey State Board of Taxes And Assessment", | |||
"New Jersey Division of Tax Appeals", | |||
"New Jersey State Board of Taxes and Assessment", | |||
"New Jersey Department of Taxation and Finance" | |||
"New Jersey Department of Taxation and Finance", |
I think this is different from the tax board. Can we check whether all of these are different parts of the process?
"New York Police Justice's Court", | ||
"New York Village Court, Nassau County" | ||
"New York Village Court, Nassau County", | ||
"Justice Court", | ||
"Village Court, Nassau County", | ||
"Justice Court of the City of New York" |
Didn't we break the justice courts apart into each sub-court? If so, we should put Nassau and the City of New York with their specific lower divisions.
@@ -25861,7 +26061,8 @@ | |||
"regex": [ | |||
"County Court of New York", | |||
"New York County Court", | |||
"County Court of New York General Sessions" | |||
"County Court of New York General Sessions", | |||
"County Court" |
This doesn't feel specific enough. It will cause all sorts of confusion across all the states every time "County Court" comes up and we don't match.
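A small sketch of the concern, assuming patterns are applied as a case-insensitive substring search (how courts-db actually anchors its patterns is not shown in this thread). The non-New York strings are made up for illustration.

```python
import re

bare = re.compile(r"County Court", re.IGNORECASE)
qualified = re.compile(r"County Court of New York", re.IGNORECASE)

court_strings = [
    "County Court of New York",
    "County Court, Harris County, Texas",  # illustrative, not from the dataset
    "Denver County Court",                 # illustrative, not from the dataset
]

# The bare pattern claims every string; the qualified one only claims New York.
bare_matches = [s for s in court_strings if bare.search(s)]
qualified_matches = [s for s in court_strings if qualified.search(s)]
```

Even with full-string anchoring, a bare "County Court" would resolve to New York for any state whose source text omits the state name, which is the same ambiguity from a different direction.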
"New York Surrogate's Court, \b.*\b County" | ||
"Surrogate'?s? Court(\\.|,)? (Albany|Allegany|Broome|Bronx|Cattaraugus|Cayuga|Chautauqua|Chemung|Chenango|Clinton|Columbia|Cortland|Delaware|Dutchess|Erie|Essex|Franklin|Fulton|Genesee|Greene|Hamilton|Herkimer|Jefferson|King|Kings|Lewis|Livingston|Madison|Montgomery|Monroe|Nassau|Niagara|New York|Orleans|Oneida|Onondaga|Ontario|Orange|Otsego|Oswego|Putnam|Queens|Rennsselaer|Rensselaer|Richmond|Rockland|Saratoga|Schenectady|Schoharie|Schuyler|Seneca|St. Lawrence|Steuben|Suffolk|Sullivan|Tioga|Tompkins|Ulster|Warren|Washington|Wayne|Westchester|Wyoming|Yates) County", | ||
"New York Surrogate's Court, \b.*\b County", | ||
"Surrogate's Court", |
Also, I think this is too wide open and leaves room for confusion.
"Surrogate's Court", | ||
"Surrogates' Court, New York County", | ||
"Surrogate'? Court, (Allegany|Cattaraugus|Kings|Saratoga|Westchester) County", | ||
"Surrogate's Court of the City of New York(\\.|,)? (Albany|Bronx|Chautauqua|Dutchess|Erie|Kings|Nassau|Monroe|New York|Queens|Richmond|Rockland|Schenectady|Schoharie|Schuyler|Seneca|St. Lawrence|Steuben|Suffolk|Westchester|Wyoming) County", |
We need to escape the period, as in St\., I think.
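Escaping in this file has two layers, JSON and then regex, and they interact in ways that are easy to miss. The sketch below assumes the JSON source contains the backslashes exactly as rendered in the diff on this page; if the page rendering dropped a backslash, the `\b` observation does not apply.

```python
import json
import re

# A regex period escape has to be written with a doubled backslash in JSON.
decoded_dot = json.loads(r'"St\\. Lawrence"')  # regex text: St\. Lawrence

# A single backslash before "." is not a legal JSON escape at all.
try:
    json.loads(r'"St\. Lawrence"')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False

# "\b" is a legal JSON escape, but it decodes to a backspace (U+0008),
# not to the regex word boundary.
as_written = json.loads(r'''"New York Surrogate's Court, \b.*\b County"''')
doubled = json.loads(r'''"New York Surrogate's Court, \\b.*\\b County"''')

sample = "New York Surrogate's Court, Kings County"
has_backspace = "\x08" in as_written
matches_as_written = re.fullmatch(as_written, sample) is not None
matches_when_doubled = re.fullmatch(doubled, sample) is not None
```

So under that assumption the `\b.*\b` pattern would never match ordinary text, and once the backslashes are doubled it matches any text between the comma and "County".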
@quevon24 thanks, this is a lot of work, but I have some questions about some of these changes.
Thrilled to see movement here. Why did we do this though? Like, what project or effort is dependent on this?
In the old importer we used many strategies to find the court, such as a regex or the folder name (https://github.com/freelawproject/courtlistener/blob/main/cl/corpus_importer/import_columbia/regexes_columbia.py). To update the importer, I thought the best approach would be to have that in courts-db. I also added some courts that were manually entered into CourtListener but were not in courts-db.
The new courts in this PR were already added at the end of last year and the beginning of this year. Some of the new variations in this PR were also added in a different PR, and some of them may have been taken from this PR. See the merged PRs from Dec 29, 2023 to Jan 5, 2024. From this PR we will probably only need to add a few more variations, but it would be most convenient to do that in a separate PR: resolving the existing conflicts will be very difficult given the large number of changes made in this PR, and the diff with the main branch is huge.
New courts and court variations obtained from Columbia data
Many of the new courts are from New York. Here is a list of some of the new courts added:
Some of these courts were entered in CourtListener (https://www.courtlistener.com/help/api/jurisdictions/) but not in courts-db.