Highest Voted Questions - Bioinformatics Stack Exchange

3

votes

1 answer

Running htseq-count over BAM files

I am trying to derive an expression matrix from BAM files using htseq-count. These are bulk RNA-Seq BAMs, by the way. I have read the htseq-count documentation as well as the samtools documentation, and figured that the following command should work: samtools sort -on…

asked Mar 12 '19 at 15:43

h3ab74

836
5
14

3

votes

3 answers

Tools for quality trimming at 5 prime?

I'm looking for a configurable tool that can do quality trimming at the 5' end. For example, I would like to choose the quality threshold, the read length, etc. Any recommendations? I'm currently using sickle for quality trimming. However,…

asked Mar 12 '19 at 14:44

aLbAc

196
6

3

votes

1 answer

How can I assign a genomic region into a window using R?

I have asked this question at BioStars but have not gotten any suggestions so far. So, I am asking the community here, perhaps someone here has an idea how I can solve my problem. I am dealing with a bunch of inversions and running some analysis on…

r

asked Mar 07 '19 at 16:28

Anna1364

516
2
8

3

votes

2 answers

Getting data from fastq by generator

I have a training task in which I have to read and filter the 'good' reads from big FASTQ files. I downsampled and got the code working, saving the reads in a Python dictionary. But it turns out the original files are huge, and I rewrote the code to give a…

asked Mar 06 '19 at 14:20

Paulo Sergio Schlogl

203
2
5
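
The streaming approach the question above describes can be sketched as follows. This is a minimal illustration, not the asker's code: it assumes four-line FASTQ records, Phred+33 quality encoding, and a "good" read defined as one whose mean quality meets a threshold. The names `read_fastq`, `mean_quality`, `good_reads`, and the `min_mean_q` parameter are hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield one record at a time so the whole file never sits in memory."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 is an assumption)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def good_reads(handle: TextIO, min_mean_q: float = 20.0) -> Iterator[Record]:
    """Lazily yield only the reads whose mean quality meets the threshold."""
    for name, seq, qual in read_fastq(handle):
        if mean_quality(qual) >= min_mean_q:
            yield name, seq, qual


# Small in-memory example: r1 has mean quality 40, r2 has mean quality 2.
sample = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n####\n")
kept = [name for name, _, _ in good_reads(sample)]
```

Because both functions are generators, a consumer can write the kept reads to disk as they arrive instead of collecting them in a dictionary first.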

3

votes

1 answer

Mix globbing and wildcards when specifying rule input

Consider the following scenario. ├── sample-alice │ ├── sequence_1.fastq │ ├── sequence_2.fastq │ ├── ... │ └── sequence_n.fastq └── sample-bob ├── sequence_1.fastq ├── sequence_2.fastq ├── ... └── sequence_m.fastq I'm…

snakemake

asked Mar 05 '19 at 17:55

Daniel Standage

5,080
15
50

3

votes

3 answers

GTF from consensus sequence

I am new to using bioinformatic tools and I was hoping this community could help clear something up for me. I need to generate a gtf file. My data are a set of complete HA genes for influenza B viruses in fasta format. From reading through forums…

asked Mar 05 '19 at 14:48

Lindsey

33
3

3

votes

1 answer

Significance and timing of "mux scans"

I'm using MinIONQC to do quality control on some ONT data. The software plots several read characteristics over time (hours passed during the sequencing process). These plots contain several vertical red lines. From the documentation: Muxes, which…

asked Mar 04 '19 at 20:39

Daniel Standage

5,080
15
50

3

votes

2 answers

Unable to install bedtools on windows 10 ubuntu

I am trying to install bedtools on windows 10, but I get an error I don't understand. How can I fix it? Building BEDTools: ========================================================= DETECTED_VERSION = v2.27.1 CURRENT_VERSION = Updating version…

asked Mar 04 '19 at 17:33

DangIt

41
3

3

votes

0 answers

Getting genes specially up or down regulated

I have 6 RNA-seq samples like this 4 patients (005, 036, 121, 013) I have 3 tumour samples and 3 cancer models (organoid) This is PCA of log transformed data by DESeq2 (tumour Vs. organoid) In this paper…

asked Mar 04 '19 at 10:55

Zizogolu

2,148
11
44

3

votes

1 answer

Time and memory efficient processing of many intervals for a bigWig file

I am trying to calculate the mappability adjusted length of introns as described by Boutz et al. Briefly, for each intron I wish to calculate the length minus the number of bases that are non-uniquely mappable. Mappability tracks can be downloaded…

bigwig

asked Jun 13 '17 at 19:39

Ian Sudbery

3,311
1
11
21
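
The calculation described in the question above (intron length minus the number of non-uniquely mappable bases) can be illustrated with a small sketch. It does not read bigWig files; it assumes the non-uniquely mappable regions have already been extracted as non-overlapping half-open `(start, end)` intervals on the same chromosome as the intron. The function name `adjusted_length` and the example coordinates are hypothetical.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open: start inclusive, end exclusive


def adjusted_length(intron: Interval, non_unique: Iterable[Interval]) -> int:
    """Intron length minus the bases covered by non-uniquely mappable intervals.

    Assumes the intervals in `non_unique` do not overlap each other.
    """
    start, end = intron
    masked = 0
    for s, e in non_unique:
        overlap = min(end, e) - max(start, s)
        if overlap > 0:
            masked += overlap
    return (end - start) - masked


# A 100 bp intron with two partially overlapping low-mappability regions.
example = adjusted_length((100, 200), [(90, 110), (150, 160), (300, 400)])
```

This linear scan checks every interval for every intron; for many introns, sorting the intervals and using a sweep or an interval index would reduce the work.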

3

votes

1 answer

Why does a missing 11plex TMT label show up at almost 50% intensity compared to other labels?

We have a test plate that was done with 10 of the 11plex TMT labels. So, the intensities of the 11th label should be almost zero, plus the leakage from the 10th label due to 13C content. Although we observe a substantial drop in the intensities, we…

asked Mar 01 '19 at 16:29

Soerendip

1,295
11
22

3

votes

3 answers

Normalization for two bulk RNA-Seq samples to enable reliable fold-change estimation between genes

I have two bulk RNA-Seq samples, already tpm-normalized. I would like to know what is a reasonable normalization procedure to enable downstream log fold-change estimation. The distribution of the two samples using the common set of genes looks…

asked Feb 28 '19 at 20:07

gc5

1,783
18
32

3

votes

1 answer

programmatic secondary structure prediction for >36-mer DNA oligonucleotides

I'm writing a tool to automate Sanger sequencing primer design for a production lab that uses universal-tail chemistry Sanger sequencing to verify NGS results. Essentially, the template DNA is amplified using primers with universal 5' tails…

asked Feb 26 '19 at 20:56

mRotten

192
9

3

votes

1 answer

Why can I not install snakemake on my SLURM computer even though I can find it in the bioconda channel?

I searched for the package 'snakemake' on my SLURM cluster using: conda search --channel bioconda snakemake and I get many versions, up to 5.4.2. I then try to install it using: conda install --channel bioconda --yes snakemake=5.4.2 but it fails…

asked Feb 25 '19 at 13:49

Biomagician

2,459
16
30

3

votes

2 answers

Separation of mixed plasmid DNA sequences post whole-plasmid sequencing

Imagine a DNA sample containing a mixture of different intact plasmids. These samples are sequenced using either MiSeq or HiSeq sequencing. Would it be possible to assemble these plasmids post-sequencing as they would have been when sequenced…

asked Feb 22 '19 at 13:31

Roelof Coertze

155
2
