Commit cd1cf2a

Update README.md
1 parent 795c8eb commit cd1cf2a

1 file changed

Lines changed: 0 additions & 2 deletions

README.md

@@ -1,7 +1,5 @@
 # The Tokenizer for Clinical Cases Written in Spanish
 
-## Digital Object Identifier (DOI) and access to dataset files
-
 
 ## Introduction
 This repository contains the tokenization model trained using the SPACCC_TOKEN corpus (https://github.com/PlanTL-SANIDAD/SPACCC_TOKEN). The model was trained using the 90% of the corpus (900 clinical cases) and tested against the 10% (100 clinical cases). This model is a great resource to tokenize biomedical documents, specially clinical cases written in Spanish.
