Most Popular

1500 questions
3
votes
1 answer

Running htseq-count over BAM files

I am trying to derive an expression matrix from BAM files using htseq-count. These are bulk RNA-Seq BAMs, by the way. I have read the htseq-count and samtools documentation and figured that the following command should work: samtools sort -on…
h3ab74
  • 836
  • 5
  • 14
3
votes
3 answers

Tools for quality trimming at 5 prime?

I'm looking for a configurable tool that can do quality trimming at the 5' end. For example, I would like to choose the quality threshold, the read length, etc. Any recommendations? I'm currently using sickle for quality trimming. However,…
aLbAc
  • 196
  • 6
3
votes
1 answer

How can I assign a genomic region into a window using R?

I have asked this question at BioStars but have not gotten any suggestions so far. So, I am asking the community here, perhaps someone here has an idea how I can solve my problem. I am dealing with a bunch of inversions and running some analysis on…
Anna1364
  • 516
  • 2
  • 8
3
votes
2 answers

Getting data from fastq by generator

I have a training task in which I have to read and filter the 'good' reads from big fastq files. I downsampled and got the code working, saving the reads in a Python dictionary. But it turns out the original files are huge, and I rewrote the code to give a…
3
votes
1 answer

Mix globbing and wildcards when specifying rule input

Consider the following scenario.
├── sample-alice
│   ├── sequence_1.fastq
│   ├── sequence_2.fastq
│   ├── ...
│   └── sequence_n.fastq
└── sample-bob
    ├── sequence_1.fastq
    ├── sequence_2.fastq
    ├── ...
    └── sequence_m.fastq
I'm…
Daniel Standage
  • 5,080
  • 15
  • 50
3
votes
3 answers

GTF from consensus sequence

I am new to using bioinformatic tools and I was hoping this community could help clear something up for me. I need to generate a gtf file. My data are a set of complete HA genes for influenza B viruses in fasta format. From reading through forums…
Lindsey
  • 33
  • 3
3
votes
1 answer

Significance and timing of "mux scans"

I'm using MinIONQC to do quality control on some ONT data. The software plots several read characteristics over time (hours passed during the sequencing process). These plots contain several vertical red lines. From the documentation: Muxes, which…
Daniel Standage
  • 5,080
  • 15
  • 50
3
votes
2 answers

Unable to install bedtools on windows 10 ubuntu

I am trying to install bedtools on Windows 10, but I get an error I don't understand. How can I fix it? Building BEDTools: ========================================================= DETECTED_VERSION = v2.27.1 CURRENT_VERSION = Updating version…
DangIt
  • 41
  • 3
3
votes
0 answers

Getting genes specifically up- or down-regulated

I have 6 RNA-seq samples like this: 4 patients (005, 036, 121, 013). I have 3 tumour samples and 3 cancer models (organoid). This is the PCA of log-transformed data by DESeq2 (tumour vs. organoid). In this paper…
Zizogolu
  • 2,148
  • 11
  • 44
3
votes
1 answer

Time and memory efficient processing of many intervals for a bigWig file

I am trying to calculate the mappability adjusted length of introns as described by Boutz et al. Briefly, for each intron I wish to calculate the length minus the number of bases that are non-uniquely mappable. Mappability tracks can be downloaded…
Ian Sudbery
  • 3,311
  • 1
  • 11
  • 21
3
votes
1 answer

Why does a missing 11plex TMT label show up at almost 50% intensity compared to other labels?

We have a test plate that was done with 10 of the 11plex TMT labels. So, the intensities of the 11th label should be almost zero, plus the leakage from the 10th label due to 13C content. Although we observe a substantial drop in the intensities, we…
Soerendip
  • 1,295
  • 11
  • 22
3
votes
3 answers

Normalization for two bulk RNA-Seq samples to enable reliable fold-change estimation between genes

I have two bulk RNA-Seq samples, already TPM-normalized. I would like to know what a reasonable normalization procedure would be to enable downstream log fold-change estimation. The distribution of the two samples using the common set of genes looks…
gc5
  • 1,783
  • 18
  • 32
3
votes
1 answer

Programmatic secondary structure prediction for >36-mer DNA oligonucleotides

I'm writing a tool to automate Sanger sequencing primer design for a production lab that uses universal-tail Sanger sequencing chemistry to verify NGS results. Essentially, the template DNA is amplified using primers with universal 5' tails…
mRotten
  • 192
  • 9
3
votes
1 answer

Why can I not install snakemake on my SLURM computer even though I can find it in the bioconda channel?

I searched for the package 'snakemake' on my SLURM cluster using: conda search --channel bioconda snakemake and I get many versions, up to 5.4.2. I then try to install it using: conda install --channel bioconda --yes snakemake=5.4.2 but it fails…
Biomagician
  • 2,459
  • 16
  • 30
3
votes
2 answers

Separation of mixed plasmid DNA sequences post whole-plasmid sequencing

Imagine a DNA sample containing a mixture of different intact plasmids. These samples are sequenced using either MiSeq or HiSeq sequencing. Would it be possible to assemble these plasmids post-sequencing as they would have been when sequenced…