Organizing computational work on statistical projects; use for questions about data storage, data sharing, code repositories, etc. Note that questions about programming or unrelated to statistics are off-topic.
Questions tagged [project-management]
30 questions
92
votes
7 answers
How to efficiently manage a statistical analysis project?
We often hear of project management and design patterns in computer science, but less frequently in statistical analysis. However, it seems that a decisive step toward designing an effective and durable statistical project is to keep things…
chl
- 53,725
32
votes
7 answers
Why is a comma a bad record separator/delimiter in CSV files?
I was reading this article and I'm curious for the proper answer to this question.
The only thing that comes to mind is that in some countries the decimal separator is a comma, which may cause problems when sharing data in CSV, but I'm…
David Gasquez
- 498
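The decimal-comma concern raised in the excerpt above can be illustrated with a minimal Python sketch. The record below (`pi`, `3,14`) is a made-up example, not data from the question; the sketch only shows how an unquoted decimal comma collides with a comma delimiter, and two common ways around it.

```python
import csv
import io

# Hypothetical record: a label and the value 3,14 written with a decimal comma.
row = ["pi", "3,14"]

# Joined naively with commas and no quoting, the value is split into two fields.
naive = ",".join(row)
parsed_naive = next(csv.reader(io.StringIO(naive)))

# csv.writer quotes any field containing the delimiter, so the round trip is safe.
buf = io.StringIO()
csv.writer(buf).writerow(row)
parsed_quoted = next(csv.reader(io.StringIO(buf.getvalue())))

# Using a semicolon as the delimiter avoids the clash without quoting.
buf2 = io.StringIO()
csv.writer(buf2, delimiter=";").writerow(row)
parsed_semicolon = next(csv.reader(io.StringIO(buf2.getvalue()), delimiter=";"))

print(parsed_naive)      # three fields instead of two
print(parsed_quoted)     # two fields, value intact
print(parsed_semicolon)  # two fields, value intact
```

Whether quoting or a different delimiter is preferable depends on the tools that will read the file; this sketch does not settle that question.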
18
votes
10 answers
Strategy for editing comma separated value (CSV) files
When I work on data analysis projects I often store data in comma- or tab-delimited (CSV, TSV) data files. While data often belongs in a dedicated database management system, for many of my applications this would be overdoing things.
I can edit…
Jeromy Anglim
- 44,984
17
votes
5 answers
Simple, reliable, open, and interoperable plain text format for storing data
In a previous question I asked about tools for editing CSV files.
Gavin linked to a comment on R Help by Duncan Murdoch suggesting that Data Interchange Format is a more reliable way to store data than CSV.
For some applications a dedicated database…
Jeromy Anglim
- 44,984
15
votes
3 answers
What is a practically good data analysis process?
I would like to know, or have references on, the analysis process that most statistical data analysts go through for each data analysis project.
If I were to make a list, to complete a data analysis project an analyst has to:
first collect requirements for…
Tae-Sung Shin
- 655
11
votes
3 answers
Improving variable names in a dataset
Good variable names are:
a) short / easy to type,
b) easy to remember,
c) understandable / communicative.
Am I forgetting anything? Consistency is something to look for. The way I would put it is that consistent naming conventions contribute…
Michael Bishop
- 2,191
8
votes
5 answers
What is a good general purpose plain text data format like that used for Bibtex?
Context
I'm writing a few multiple choice practice questions and I'd like to store them in a simple plain text data format. I've previously used tab delimited, but that makes editing in a text editor a bit awkward. I'd like to use a format a bit…
Jeromy Anglim
- 44,984