
I host a small file server where users from all around the world can upload documents.

Because of encoding problems (see otherquestion), I am wondering whether I should prevent users from uploading (and, in turn, downloading) files whose names cannot be represented in the CP1252 charset.

Or, put the other way round: does it make sense to allow users to upload documents with Arabic or Chinese characters in their file names?

PS: They download the same file some time later, and it should have the same file name as when it was uploaded.

Niko

1 Answer


Store the files on disk under a randomly generated name, or derive the name from a hash of the file contents (which also helps deduplicate storage). Save the original file name as metadata in a database, together with all other metadata about the file (who uploaded it and so on). Then serve the file through a PHP script that sets the original file name from the database in an HTTP header. This way you don't need to worry about:

  • file name sanitisation or duplication
  • file system encoding issues
  • storage duplication (if using a hash)
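
The storage side of this approach could look roughly like the following. This is a minimal sketch in Python rather than PHP, and it is not a finished upload handler. The `files` table, its columns, and the helper names `open_db` and `store_upload` are assumptions made up for illustration. The on-disk name is the SHA-256 hex digest of the content, so it is always plain ASCII, and the original name (Arabic, Chinese, or anything else) only ever lives in the database.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_db(db_path=":memory:"):
    """Open the metadata database (hypothetical schema, for illustration only)."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "id INTEGER PRIMARY KEY, digest TEXT NOT NULL, "
        "original_name TEXT NOT NULL, uploader TEXT NOT NULL)"
    )
    return db


def store_upload(db, storage_dir, original_name, data, uploader):
    """Store the bytes under a hash-based name; keep the original name as metadata."""
    # The hex digest is ASCII-only, so the file system encoding never matters.
    digest = hashlib.sha256(data).hexdigest()
    path = Path(storage_dir) / digest
    # Identical content maps to the same digest, so it is written only once.
    if not path.exists():
        path.write_bytes(data)
    # Each upload still gets its own metadata row with its own original name.
    db.execute(
        "INSERT INTO files (digest, original_name, uploader) VALUES (?, ?, ?)",
        (digest, original_name, uploader),
    )
    db.commit()
    return digest
```

Two uploads with the same content but different names end up as one file on disk and two rows in the table, which is the deduplication effect mentioned above.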
deceze
  • Thanks for these hints. One thing must work: accessing the file directly by link. So if a user gets a URL to the file, does the PHP script connect to the DB every time? Doesn't this take too much time when he has, e.g., 10 files to open? Do you have a ready PHP script for that? – Niko Apr 15 '14 at 14:03
  • Database access should in no way be a limiting factor here if done decently. See http://php.net/readfile. Also see http://stackoverflow.com/a/20563773/476 if you want pretty links. Also ideally see https://tn123.org/mod_xsendfile/. – deceze Apr 15 '14 at 15:30
  • Hmm, do you have a reference script for that? – Niko Apr 17 '14 at 09:10
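
For the download step discussed above (sending the original file name back in an HTTP header), one possible sketch follows. It is not a script from the answer's author. It is written in Python, and the helper name `content_disposition` is made up for illustration. It builds a `Content-Disposition` value with an ASCII fallback in `filename` and the real name as percent-encoded UTF-8 in `filename*`, in the style of RFC 6266. In PHP, a value like this would be passed to `header()` before the file is output with `readfile()`.

```python
from urllib.parse import quote


def content_disposition(original_name):
    """Build a Content-Disposition header value that can carry a Unicode file name."""
    # ASCII fallback for clients that ignore filename*: non-ASCII becomes "?".
    fallback = original_name.encode("ascii", "replace").decode("ascii")
    # Replace quotes, backslashes and control characters so the quoted string
    # stays well-formed and no line break can be smuggled into the header.
    fallback = "".join(
        c if c.isprintable() and c not in '"\\' else "_" for c in fallback
    )
    # filename* carries the real name as percent-encoded UTF-8.
    encoded = quote(original_name, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"
```

A plain ASCII name passes through unchanged in both parameters. A name such as `报告.pdf` gets a `??.pdf` fallback plus the fully encoded form, which current browsers generally prefer when both are present.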