I have a large plain-text file to read in R, in which all the data sits on a single line with no spaces (a DNA sequence with no header). I found the following function:

readChar("filename", nchars = n)

which reads just the first "n" characters of the file and saves a lot of time. Is there another function in R that goes further and reads only from a START position to a STOP position, without loading the whole file?

Tomás Navarro

1 Answer

Basically no, as far as I know: you need to read the file and then discard the characters that you don't want. For example, if you want only the first 10 characters of what was read:

substr(readChar("filename", nchars = n), 1, 10)

However, this post (How to efficiently read the first character from each line of a text file?) shows some ways to make that more efficient.
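
To make the read-then-discard idea concrete for a START/STOP range, here is a minimal sketch. It assumes the file is a single line of plain ASCII letters, as in the question; the helper name `read_range` and the demo file are hypothetical and only for illustration. It reads just the first STOP characters with `readChar` and then drops everything before START with `substr`, so the part of the file after STOP is never read.

```r
# Hypothetical helper: return characters start..stop of a single-line file.
# Only the first `stop` characters are read; the leading part is then discarded.
read_range <- function(path, start, stop) {
  stopifnot(start >= 1, stop >= start)
  prefix <- readChar(path, nchars = stop)  # read up to the STOP position
  substr(prefix, start, stop)              # keep START..STOP only
}

# Demo on a throwaway file containing a short made-up sequence
demo_file <- tempfile(fileext = ".txt")
cat("ACGTACGTTTGACCA", file = demo_file, sep = "")

read_range(demo_file, 5, 10)
```

The cost still grows with STOP, because the leading START - 1 characters are read and thrown away, but a range near the beginning of a large file stays cheap.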

    Thank you Ricardo, I had not found this post. It was what I was looking for but, unfortunately, it seems it is not possible to read a file starting from an arbitrary position. Anyway, using readChar instead of scan improves the execution time a lot. On the other hand, I do not see any difference between stri_sub from stringi and substring from base when reading large files. Thanks again! – Tomás Navarro Oct 22 '20 at 10:01