ImpactU - Product Detail

Indexed NLP Article Metadata Dataset

Open Access

Language: Not available

Published: 01/01/2023

APC (est.): Not available

Abstract:

This dataset consists of a curated collection of published, indexed articles (N=75527) related to Natural Language Processing (NLP), collected from Web of Science, along with a classification of each article into one of five categories according to the approach to NLP it uses:

Category 0 (Rule-Based): A model based on rules or symbolic analysis is used.
Category 1 (Statistical Methods): An approach using statistical methods is used. This includes BoWs, N-Grams and TF-IDF, along with other machine learning techniques such as SVMs, Logistic Regression, LDA and others. Shallow neural network models like word2vec also belong in this category.
Category 2 (Deep Learning): Approaches that use Deep Learning and other Deep Neural Network architectures such as RNNs, CNNs and LSTMs are included in this category.
Category 3 (Transformer Models): The proposed approach uses transformer-based models, like BERT, GPT, T5 and others.
Category 4: The abstract does not mention a particular model or technique. Papers analyzing frameworks, surveys, papers centered on the computer vision component of NLP, and dataset proposals, among others, fall into this category.

Note that the classification may be imprecise and is not strictly defined; it should be used only as a starting point.

Fields: 'Authors', 'Article Title', 'Volume', 'Issue', 'Special Issue', 'Start Page', 'End Page', 'DOI', 'Book DOI', 'Publication Date', 'Times Cited', 'ISSN', 'eISSN', 'Author Full Names', 'Book Author Full Names', 'Language', 'Author Keywords', 'Keywords', 'Funding Orgs', 'Funding Text', 'Cited References', 'DOI Link', 'Number of Pages', 'Categories', 'Research Areas', 'bert_preds', 'setfit_preds', 'knn_preds', 'abstract_hash'.

The dataset is provided in different formats. To address potential copyright, licensing, and data privacy concerns, we have replaced the original abstracts with SHA-256 hashes, which are cryptographic digests of the abstracts' content.
Please note that the copyright and licensing status of the original articles may vary, and users should respect any applicable terms and restrictions associated with the source publications.
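
Because the abstracts are published only as SHA-256 hashes, a user who has lawfully obtained an abstract from its source could, in principle, check it against the 'abstract_hash' field. The following is a minimal sketch of that check in Python. The function names are hypothetical, and the description does not state how the text was encoded or normalized before hashing, so UTF-8 encoding, no whitespace normalization, and a hexadecimal digest are all assumptions here; a mismatch may only mean that those assumptions differ from the dataset's actual procedure.

```python
import hashlib


def abstract_sha256(abstract: str) -> str:
    """Return the hex SHA-256 digest of an abstract.

    Assumption: the text is hashed as UTF-8 bytes with no normalization.
    The dataset description does not specify either choice.
    """
    return hashlib.sha256(abstract.encode("utf-8")).hexdigest()


def matches_abstract_hash(abstract: str, abstract_hash: str) -> bool:
    """Compare an abstract against a stored 'abstract_hash' value.

    Assumption: the stored value is a hexadecimal string; case and
    surrounding whitespace are ignored in the comparison.
    """
    return abstract_sha256(abstract) == abstract_hash.strip().lower()
```

A call such as `matches_abstract_hash(text, row_hash)` returns `True` only when the digests agree exactly, so even a single differing character (for example, a trailing newline) yields `False`.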

Topic:

No topics available

Citations:

Citations per year:

No citation data available

Source Information:

Source: Harvard Dataverse
Quartile (publication year): Not available
Volume: Not available
Issue: Not available
Pages: Not available
pISSN: Not available
ISSN: Not available
OpenAlex profile: https://openalex.org/S4377196806

Links and Identifiers:

OpenAlex URL: https://openalex.org/W4398734528
DOI URL: https://doi.org/10.7910/dvn/5yigng
Open access URL: https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/5YIGNG

Dataset