Questions tagged [project-management]

Organizing computational work on statistical projects; use for questions about data storage, data sharing, code repositories, etc. Note that questions about programming or unrelated to statistics are off-topic.

30 questions
92 votes, 7 answers

How to efficiently manage a statistical analysis project?

We often hear of project management and design patterns in computer science, but less frequently in statistical analysis. However, it seems that a decisive step toward designing an effective and durable statistical project is to keep things…
chl
  • 53,725
32 votes, 7 answers

Why is a comma a bad record separator/delimiter in CSV files?

I was reading this article and I'm curious about the proper answer to this question. The only thing that comes to mind is that in some countries the decimal separator is a comma, which may cause problems when sharing data in CSV, but I'm…
18 votes, 10 answers

Strategy for editing comma separated value (CSV) files

When I work on data analysis projects, I often store data in comma- or tab-delimited (CSV, TSV) data files. While data often belongs in a dedicated database management system, for many of my applications this would be overdoing things. I can edit…
Jeromy Anglim
  • 44,984
17 votes, 5 answers

Simple, reliable, open, and interoperable plain text format for storing data

In a previous question I asked about tools for editing CSV files. Gavin linked to a comment on R Help by Duncan Murdoch suggesting that Data Interchange Format is a more reliable way to store data than CSV. For some applications a dedicated database…
Jeromy Anglim
  • 44,984
15 votes, 3 answers

What is a practically good data analysis process?

I would like to know, or have references on, the analysis process that most statistical data analysts go through for each data analysis project. If I make a "list", to complete a data analysis project an analyst has to: first collect requirements for…
Tae-Sung Shin
  • 655
11 votes, 3 answers

Improving variable names in a dataset

Good variable names are: a) short / easy to type, b) easy to remember, c) understandable / communicative. Am I forgetting anything? Consistency is something to look for. The way I would put it is that consistent naming conventions contribute…
8 votes, 5 answers

What is a good general purpose plain text data format like that used for Bibtex?

Context: I'm writing a few multiple-choice practice questions and I'd like to store them in a simple plain text data format. I've previously used tab-delimited files, but that makes editing in a text editor a bit awkward. I'd like to use a format a bit…
Jeromy Anglim
  • 44,984