How to extract organization names? #916
tomdavidson asked this question in Q&A
Answered by spencermountain
How do I extract organization names? I seem to be missing something.

const exp = [
"Adt Security Systems Inc\n1501 Yamato Rd.\nBoca Raton, FL 33431",
"Aldous\nPo Box 171374\nHolladay, UT 84117",
"AT&T\nAttn Bankruptcy Department\n5407 Andrews Highway\nMidland, TX 79706",
"Axcess Financial\n7755 Montogomery Rd\nSuite 400\nCincinnati, OH 45236",
"Capital One\nAttn: General Correspondence/Bankruptcy\nPo Box 30285\nSalt Lake City, UT 84130",
"Capital One Bank (USA), N.A.\nby American InfoSource as agent\nPO Box 71083\nCharlotte, NC 28272-1083",
"Speedy/Rapid Cash\nP.O. Box 780408\nWichita, KS 67278",
"UAC/Carhop\nPO Box 398104\nEdina, MN 55439",
"Utah Community Credit Union\n188 River Park Drive\nProvo, UT 84604-5648",
"Utah State Tax Commission\nTaxpayer Services Division\n210 North 1950 West\nSalt Lake City, UT 84134",
"Verizon\nby American InfoSource\nPO Box 248838\nOklahoma City, OK 73126-8838",
"America First Credit Union\n1344 W 4675 S\nOgden, UT 84405",
"America First Credit Union\nPo Box 9199\nOgden, UT 84409-0199",
"America First Credit Union\nAttn: John Lund, President & CEO\n1344 W 4675 S\nOgden, UT 84405",
"American Coradius International LLC\n2420 Sweet Home Rd Ste 150\nBuffalo, NY 14228",
"American Express\nPo Box 981535\nEl Paso, TX 79998-1535",
"American Express\nPo Box 650448\nDallas, TX 75265-0448",
"American Express\nPo Box 297871\nFort Lauderdale, FL 33329",
"American Express\nPo Box 981537\nEl Paso, TX 79998",
"American Express\nCorrespondence/Bankruptcy\nPo Box 981540\nEl Paso, TX 79998",
"American Express\nPo Box 6985\nBuffalo, NY 14240-6985",
"Amex\nCorrespondence/Bankruptcy\nPo Box 981540\nEl Paso, TX 79998",
"Amex\nPo Box 981537\nEl Paso, TX 79998",
"Capital One\nPo Box 60599\nCity of Industry, CA 91716-0599",
].map((s) => nlp(s));
console.table(exp.map((doc) => doc.organizations().text()));
console.table(exp.map((doc: Three) => doc.places().text()));
console.table(exp.map((doc: Three) => doc.topics().text()));

yields this output:

┌─────────┬─────────────────────────────────────────────────────────┐
│ (index) │ Values │
├─────────┼─────────────────────────────────────────────────────────┤
│ 0 │ 'Adt Security Systems Inc' │
│ 1 │ '' │
│ 2 │ 'AT&T\nAttn Bankruptcy Department' │
│ 3 │ 'Axcess Financial' │
│ 4 │ 'Capital One' │
│ 5 │ 'Capital One Bank (USA),' │
│ 6 │ '' │
│ 7 │ '' │
│ 8 │ 'Utah Community Credit Union' │
│ 9 │ 'Utah State Tax Commission\nTaxpayer Services Division' │
│ 10 │ 'Verizon' │
│ 11 │ 'America First Credit Union' │
│ 12 │ 'America First Credit Union' │
│ 13 │ 'America First Credit Union\nPresident & CEO' │
│ 14 │ 'American Coradius International LLC' │
│ 15 │ 'American Express' │
│ 16 │ 'American Express' │
│ 17 │ 'American Express' │
│ 18 │ 'American Express' │
│ 19 │ 'American Express' │
│ 20 │ 'American Express' │
│ 21 │ '' │
│ 22 │ '' │
│ 23 │ 'Capital One' │
└─────────┴─────────────────────────────────────────────────────────┘
┌─────────┬───────────────────────┐
│ (index) │ Values │
├─────────┼───────────────────────┤
│ 0 │ '1501 Yamato Rd.\nFL' │
│ 1 │ 'UT' │
│ 2 │ '' │
│ 3 │ '7755 Montogomery Rd' │
│ 4 │ 'UT' │
│ 5 │ '' │
│ 6 │ 'Wichita,' │
│ 7 │ '' │
│ 8 │ 'Utah UT' │
│ 9 │ 'Utah UT' │
│ 10 │ 'Oklahoma' │
│ 11 │ 'America UT' │
│ 12 │ 'America UT' │
│ 13 │ 'America UT' │
│ 14 │ 'Rd' │
│ 15 │ 'El Paso,' │
│ 16 │ 'Dallas,' │
│ 17 │ 'FL' │
│ 18 │ 'El Paso,' │
│ 19 │ 'El Paso,' │
│ 20 │ '' │
│ 21 │ 'El Paso,' │
│ 22 │ 'El Paso,' │
│ 23 │ '' │
└─────────┴───────────────────────┘
┌─────────┬─────────────────────────────────────────────────────────────────────┐
│ (index) │ Values │
├─────────┼─────────────────────────────────────────────────────────────────────┤
│ 0 │ 'Adt Security Systems Inc\nFL 1501 Yamato Rd.' │
│ 1 │ 'UT' │
│ 2 │ 'Attn Bankruptcy Department\nAT&T' │
│ 3 │ 'Axcess Financial\n7755 Montogomery Rd' │
│ 4 │ 'Capital One\nUT' │
│ 5 │ 'Capital One Bank (USA), Charlotte,' │
│ 6 │ 'Wichita,' │
│ 7 │ '' │
│ 8 │ 'Utah Community Credit Union\nUT Utah' │
│ 9 │ 'Taxpayer Services Division\nUtah State Tax Commission\nUT Utah' │
│ 10 │ 'Verizon\nOklahoma' │
│ 11 │ 'America First Credit Union\nUT America' │
│ 12 │ 'America First Credit Union\nUT America' │
│ 13 │ 'President & CEO\nAmerica First Credit Union\nUT America John Lund' │
│ 14 │ 'American Coradius International LLC\nRd' │
│ 15 │ 'American Express\nEl Paso,' │
│ 16 │ 'American Express\nDallas,' │
│ 17 │ 'American Express\nFL' │
│ 18 │ 'American Express\nEl Paso,' │
│ 19 │ 'American Express\nEl Paso,' │
│ 20 │ 'American Express' │
│ 21 │ 'El Paso,' │
│ 22 │ 'El Paso,' │
│ 23 │ 'Capital One' │
└─────────┴─────────────────────────────────────────────────────────────────────┘
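Every entry in the sample data puts the name on the first line of the address block, including the rows that came back empty (1, 6, 7, 21, 22). A minimal sketch of a fallback built on that observation, assuming the layout holds for all of your data. `firstLineName` is a hypothetical helper written for illustration and is not part of compromise:

```javascript
// Heuristic sketch (NOT a compromise API): treat the first line of each
// address block as the organization-name candidate.
// Assumption: the name always sits on line 1, as in the sample data.
function firstLineName(block) {
  const [first = ''] = block.split('\n');
  return first.trim();
}

const samples = [
  'Aldous\nPo Box 171374\nHolladay, UT 84117',
  'Speedy/Rapid Cash\nP.O. Box 780408\nWichita, KS 67278',
  'AT&T\nAttn Bankruptcy Department\n5407 Andrews Highway\nMidland, TX 79706',
];

const names = samples.map(firstLineName);
console.log(names); // [ 'Aldous', 'Speedy/Rapid Cash', 'AT&T' ]
```

This does no language analysis at all, so it is only useful as a fallback for rows where `.organizations()` returns nothing.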
Answered by spencermountain on Apr 29, 2022
hi Tom, yeah that looks pretty good to me.
if you're expecting some properNouns to be organizations, you can add them like

let vocab = {
  'Aldous': 'Organization'
}
nlp(text, vocab).organizations()

i'm not sure about ambiguous terms like 'Amex\nCorrespondence/Bankruptcy'. if you wanted to be more aggressive than the results of .organizations(), you could whip up something using .match('#ProperNoun+')
cheers
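To get a feel for what "more aggressive" means before wiring it into compromise, here is a rough, self-contained sketch of the two ideas in the answer: a vocabulary of known names, plus a catch-all for runs of capitalized words. It uses a plain regex as a stand-in and does not call compromise's tagger, so real `#ProperNoun+` results will differ. `capitalizedRuns`, `tagCandidates`, and the `'ProperNoun?'` label are hypothetical names made up for this example:

```javascript
// Stand-in sketch only: a regex approximation, NOT compromise's '#ProperNoun+'.
// Known names, in the same shape as the vocab object from the answer.
const vocab = { Aldous: 'Organization' };

// A run of capitalized tokens separated by single spaces.
// Tokens may contain letters, digits, and & / . ' - (e.g. AT&T, UAC/Carhop).
const CAPITALIZED_RUN = /\b[A-Z][\w&\/.'-]*(?:[ ][A-Z][\w&\/.'-]*)*/g;

// Collect every capitalized run, line by line, so runs never span a newline.
function capitalizedRuns(text) {
  return text.split('\n').flatMap((line) => line.match(CAPITALIZED_RUN) ?? []);
}

// Label each run: a vocab hit gets its tag, everything else stays a guess.
function tagCandidates(text, knownNames = vocab) {
  return capitalizedRuns(text).map((run) => ({
    text: run,
    tag: knownNames[run] ?? 'ProperNoun?',
  }));
}

console.log(tagCandidates('Aldous\nPo Box 171374\nHolladay, UT 84117'));
console.log(capitalizedRuns('UAC/Carhop\nPO Box 398104\nEdina, MN 55439'));
```

The catch-all also picks up things like 'Po Box' and 'UT', which is the trade-off with matching on capitalization alone: you would still need to filter out places and address fragments afterwards.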