My system is Windows 10 with R 3.5.3 and RStudio 1.1.463. The locale is as follows:

> Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252" 

My classmate gave me a UTF-8 CSV file, sample.csv, produced on a Linux system. The file can be reproduced with the PHP script below:

<?php
$a=
array (
      'col1' => 12,
      'col2' =>  'Y' ,
      'col3' =>  '<p style="text-align: center;">
    <strong style="text-align: center;"><span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">版权</span></strong></p>
<p>
    <span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">bla</span></p>
<p>
    <span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;"><img alt="" src="/functions/2.jpg" style="width: 400px; height: 500px;" /></span></p>
<p>
    <span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">bla</span></p>
' ,
      'col4' =>  '<br />
' );

$fp = fopen("sample.csv", "wb");

$question_list_cols = array('col1', 'col2', 'col3', 'col4');

// Write the header row first, then the single data row.
fputcsv($fp, $question_list_cols);
if (!fputcsv($fp, array_values($a))) {
    echo "fail<br />";
}

fclose($fp);
?>

When I read sample.csv in R with df <- read.csv("sample.csv", header = TRUE), I get the error "invalid input found on input connection".
I tried the answers to similar questions on SO, but none of them worked.

The problem is caused by the Chinese characters 版权. Everything works when I remove these characters.

How can I read a UTF-8 CSV file containing Chinese characters in R?
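
One approach that may be worth trying is sketched below. It has not been verified on the Windows setup above. It relies on the documented difference between two read.csv arguments: encoding = "UTF-8" only marks the strings as UTF-8, while fileEncoding = "UTF-8" re-encodes the input to the native encoding, which in a 1252 locale cannot represent Chinese characters. The temp file `path` and the three-column layout are hypothetical stand-ins for sample.csv.

```r
# Build a small UTF-8 CSV from raw bytes so the example does not depend on
# how this script itself is encoded. `path` is a hypothetical temp file
# standing in for sample.csv.
path <- tempfile(fileext = ".csv")
utf8_bytes <- as.raw(c(0xe7, 0x89, 0x88, 0xe6, 0x9d, 0x83))  # UTF-8 bytes of the two Chinese characters
writeBin(c(charToRaw("col1,col2,col3\n12,Y,"), utf8_bytes, charToRaw("\n")), path)

# encoding = "UTF-8" marks the strings as UTF-8 without converting them
# to the native (for example 1252) encoding.
df <- read.csv(path, header = TRUE, encoding = "UTF-8", stringsAsFactors = FALSE)

# Inspect the bytes and the declared encoding of the text column.
print(charToRaw(df$col3))
print(Encoding(df$col3))
```

If the bytes printed for col3 match the ones written, the data was read intact, and any remaining garbling is likely a console display issue rather than a read failure.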