
I'm trying to read a SQL file, but it keeps giving me this error:

UnicodeError: UTF-16 stream does not start with BOM

I've created a function specifically to read SQL files:

import pandas as pd
import pyodbc as db
import os
import codecs

def sql_reader_single(qry_file, server_name, database, encoding='utf16'):
    server = db.connect(str('DRIVER={SQL Server};SERVER='+server_name+';DATABASE='+database+';'))
    with codecs.open(qry_file, encoding=encoding) as qf:
        data = pd.read_sql(qf.read(), server)
    return data

Then I called it to read the data:

Data = sp.sql_reader_single(qry_file=QryFile, server_name='my_server', database='my_db')

What am I doing wrong?

I've looked into:

utf-16 file seeking in python. how?

and tried both utf-16-le and utf-16-be, but then I get an error containing a bunch of Japanese/Chinese characters, like this:

pandas.io.sql.DatabaseError: Execution failed on sql '䕓䕌呃ഠ 楤瑳湩瑣਍††⨠਍†剆䵏䔠坄䔮坄䘮捡剴捥楥楶杮潇摯⁳牦൧': ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near '0x0a0d'. (102) (SQLExecDirectW)")

The SQL file contains a very simple query, like this:

SELECT distinct *
  FROM FactReceiving
clinomaniac
alwaysaskingquestions

1 Answer


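Try reading the file as UTF-8. The garbled characters in your error message are what ASCII/UTF-8 bytes look like when they are decoded as UTF-16: for example, 䕓 is U+4553, which in little-endian order is the byte pair 0x53 0x45, i.e. the letters "S" and "E" of SELECT. That suggests this particular file was not saved as UTF-16.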
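To see why the UTF-16 attempt produces those characters, here is a minimal sketch. It assumes the file on disk holds plain ASCII/UTF-8 bytes; the `raw` variable is only a stand-in for the file contents, not something from your code.

```python
# Stand-in for the bytes of an ASCII/UTF-8 query file on disk (an assumption).
raw = "SELECT distinct *".encode("utf-8")

# Decoding as UTF-16-LE glues each pair of bytes into one code point,
# which is where the CJK-looking characters come from.
garbled = raw[:6].decode("utf-16-le")
print(garbled)  # 䕓䕌呃

# Decoding the same bytes as UTF-8 recovers the query text.
query = raw.decode("utf-8")
print(query)  # SELECT distinct *
```

With your function, that would mean passing `encoding='utf-8'` for this file instead of relying on the `utf16` default.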

clinomaniac
  • The reason I used utf-16 is that previously my query couldn't be read as utf-8 and only worked with utf-16. I am very confused about why it sometimes works with 16 and sometimes with 8; it's not like I write in two different languages. But I still really appreciate the help! – alwaysaskingquestions Feb 26 '18 at 21:58
  • I am not sure how the files were created or how they might differ. Usually the first few bytes of a file, called the BOM (byte order mark), tell the encodings apart. If you open a file in Notepad++, you can choose which encoding to use. – clinomaniac Feb 26 '18 at 22:00
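Since some of the files seem to be UTF-16 and others UTF-8, one option is to look at the BOM before opening the file. The following is only a sketch: `guess_encoding` and `read_query` are hypothetical helper names, and falling back to UTF-8 when there is no BOM is an assumption that may not hold for every file.

```python
import codecs

def guess_encoding(path, default="utf-8"):
    """Pick an encoding from the file's byte order mark, if it has one."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: its little-endian BOM begins with the UTF-16-LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # uses the BOM to pick the byte order
    # No BOM found: fall back to the default (an assumption, not a detection).
    return default

def read_query(path):
    """Read a query file as text using the guessed encoding."""
    with open(path, encoding=guess_encoding(path)) as f:
        return f.read()
```

The text returned by `read_query(qry_file)` could then be passed to `pd.read_sql` in place of `qf.read()`.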