Inability to Conduct Detailed Analysis on the CR Column of the Bibliometrix Data Frame #228
Comments
@elexingyu thank you for this feedback! Could you give me a more specific example, please? For example, what would the CR column of the WoS dataframe for W3001118548 be?
@trangdata Thank you for your response! Below are the data I obtained from Web of Science for W3001118548, which I converted into a bibliographic data frame using the bibliometrix::convert2df function before examining the CR column. Based on my observations, the CR column typically includes the first author's name, the year of publication, the publication title and article location, plus the DOI. These details make it possible to run co-citation network analysis on authors and sources with bibliometrix. You can also refer to the following webpage: A brief introduction to bibliometrix, which describes the CR column in detail in the section "Analysis of Cited References".
Below is the co-citation network analysis using the dataset from openalexR. Unfortunately, the authors' names cannot be displayed and only the OpenAlex IDs are shown:
Below is the co-citation network analysis using the WoS data, which successfully analyzed the author networks. However, it also has a drawback: it assigns all missing paper author information to 'anonymous'. OpenAlex should be able to avoid this issue thanks to its more comprehensive data:
@elexingyu Thank you for the explanation. I will need @massimoaria's input since he's more familiar with the internals of bibliometrix. In the meantime, however, you can try manually modifying the CR column. For example:
library(openalexR)
biblio_data <- oa_fetch(identifier = c("W3001118548", "W2015795623")) |>
  oa2bibliometrix()
get_cr <- function(cr) {
  r <- oa_fetch(identifier = strsplit(cr, ";")[[1]])
  auths <- show_works(r, identity)[["first_author"]]
  paste(auths, collapse = ";")
}
biblio_data$CR <- sapply(biblio_data$CR, get_cr)
str(biblio_data$CR)
#> chr [1:2] "Douglas G. Altman;Jaswinder Gill;Harriet G. Oldham;J. A. Tytler;G. L. Serfontein" ...
Created on 2024-04-10 with reprex v2.0.2
From here, your co-citation network analysis should show first-author names instead of OpenAlex IDs. To get a CR column more similar to the output from
library(openalexR)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
biblio_data <- oa_fetch(identifier = c("W3001118548", "W2015795623")) |>
  oa2bibliometrix()
shorten_doi <- function(doi) {
  gsub("^https://doi.org/", "", doi)
}
get_cr <- function(cr) {
  r <- oa_fetch(
    identifier = strsplit(cr, ";")[[1]],
    options = list(select = c(
      "authorships", "display_name", "publication_year",
      "primary_location", "doi"
    ))
  )
  auths <- vapply(
    r$author, openalexR:::get_auth_position, character(1),
    position = "first"
  )
  r |>
    mutate(
      first_aut = auths,
      doi = paste("DOI", shorten_doi(doi)),
      o = paste(first_aut, publication_year, display_name, doi, so, sep = ", ")
    ) |>
    pull(o) |>
    paste(collapse = ";") |>
    toupper()
}
biblio_data$CR <- sapply(biblio_data$CR, get_cr)
str(biblio_data$CR)
#> chr [1:2] "DOUGLAS G. ALTMAN, 1983, MEASUREMENT IN MEDICINE: THE ANALYSIS OF METHOD COMPARISON STUDIES, DOI 10.2307/298793"| __truncated__ ...
Created on 2024-04-10 with reprex v2.0.2
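For anyone who wants to check the string layout without calling the OpenAlex API, here is a minimal offline sketch of the same paste/collapse/toupper step. The mock data frame and the helper name `format_cr` are assumptions made up for illustration; only the field order (first author, year, title, DOI, source) is taken from the snippet above.

```r
# Mock rows standing in for the result of oa_fetch(); all values are
# illustrative placeholders, not real OpenAlex records.
refs <- data.frame(
  first_aut = c("Jane Doe", "John Roe"),
  publication_year = c(2001L, 2010L),
  display_name = c("A first example title", "A second example title"),
  doi = c("https://doi.org/10.1000/example.1", "https://doi.org/10.1000/example.2"),
  so = c("Journal of Examples", "Example Letters"),
  stringsAsFactors = FALSE
)

# Strip the resolver prefix so only the bare DOI remains.
shorten_doi <- function(doi) sub("^https://doi\\.org/", "", doi)

# Hypothetical helper: one upper-case, ";"-separated CR string for a set of
# cited works, with fields joined by ", " in the order used above.
format_cr <- function(refs) {
  entries <- paste(
    refs$first_aut, refs$publication_year, refs$display_name,
    paste("DOI", shorten_doi(refs$doi)), refs$so,
    sep = ", "
  )
  toupper(paste(entries, collapse = ";"))
}

cr <- format_cr(refs)
print(cr)
```

Because every field is joined with ", ", a title or source name that itself contains a comma makes the entry ambiguous to split later, which is worth keeping in mind if the string is parsed downstream.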
@trangdata Fantastic! Thank you for your reply and your code! I am touched by your selfless spirit in answering questions for others! The code basically works now, and it can generate beautiful results for the author's Co-citation Network! However, since Co-citation Network analysis can be applied to papers, authors, and sources, it impacts the arrangement of content in the CR column, so I made some modifications to the code you generously provided:
Code:
Before the modification, the result of the Papers-Co-citation Network analysis:
After the modification, the result of the Papers-Co-citation Network analysis:
However, there is a minor issue now; when analyzing the source-Co-citation Network, an error occurs. I tried to analyze the original code of this function in bibliometrix, but it is quite complex, and difficult for a newbie in R like me. The error message is as follows:
Looks like you need the CR_SO column in the input data frame for your source co-citation network analysis. So essentially you need a similar function to
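As a rough sketch of what deriving such a column could look like, the base R helper below pulls the source name out of each cited reference. It assumes the CR entries end with the source, as in the field order used earlier in this thread; the helper name `extract_cr_sources` and the sample strings are hypothetical, and whether bibliometrix accepts the result as CR_SO has not been verified here.

```r
# Hypothetical helper: for each record, split the CR string into cited
# references on `sep`, take the last ", "-separated field of every
# reference (assumed to be the source), and re-join with `sep`.
extract_cr_sources <- function(cr, sep = ";") {
  vapply(strsplit(cr, sep, fixed = TRUE), function(entries) {
    fields <- strsplit(trimws(entries), ", ", fixed = TRUE)
    sources <- vapply(fields, function(f) f[length(f)], character(1))
    paste(sources, collapse = sep)
  }, character(1), USE.NAMES = FALSE)
}

# Illustrative CR values in the "AUTHOR, YEAR, TITLE, DOI ..., SOURCE" layout.
cr <- c(
  paste0(
    "JANE DOE, 2001, A FIRST EXAMPLE TITLE, DOI 10.1000/EXAMPLE.1, JOURNAL OF EXAMPLES;",
    "JOHN ROE, 2010, A SECOND EXAMPLE TITLE, DOI 10.1000/EXAMPLE.2, EXAMPLE LETTERS"
  ),
  "JANE DOE, 2001, A FIRST EXAMPLE TITLE, DOI 10.1000/EXAMPLE.1, JOURNAL OF EXAMPLES"
)

cr_so <- extract_cr_sources(cr)
print(cr_so)
```

Taking the last field sidesteps commas inside titles, but it would still misfire on a source name that contains ", " itself, so a dedicated source column is safer if one is available.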
@elexingyu Please give the smallest reproducible code example
Here is my code for batch fetching all the paper metadata of a specific journal and then converting it into a bibliometrix data frame for analysis. However, I'm facing an issue where the CR column in the output bibliometrix data frame contains OpenAlex IDs, which prevents the analysis of cited references' authors. Are there any future updates planned to address this issue, enabling the CR column in the bibliometrix data frame obtained from OpenAlex to follow the same format as WoS data?
OpenAlexR generated data frame:
WoS data frame: