Most Popular

1500 questions
3
votes
1 answer

For phylogenetic tree construction from a core genome, which is preferable: an amino-acid-based MSA or a nucleotide-based MSA?

The genomes are from the same species. Is it true that in a phylogenetic tree constructed from an amino-acid-based MSA (multiple sequence alignment) some information is lost, so for phylogenetic reconstruction of closely related sequences nucleotide…
Ahmed Abdullah
  • 367
  • 2
  • 8
3
votes
3 answers

Compare multiple alignment results' aligned bases

I have aligned a nanopore data set to a reference genome with graphmap, minimap2 and BLASR. The alignment results are stored in BAM files. I would like to do some concordance assessment, looking at the number of base pairs that are mapped to the…
3
votes
1 answer

Subsetting an object consisting of two samples

I have a Seurat object consisting of an aggregate of two samples, namely RD1 and RD2. I am trying to make a subset of each sample, but it's not really working. I manage to make a subset of the cells but I don't manage to make a subset of the whole…
ageans
  • 131
  • 4
3
votes
1 answer

How to find novel transcripts using GffCompare?

I am trying to find novel transcripts from an RNA-seq database. Based on the advice I got, it seemed that using StringTie for transcript assembly is a good way to go, and it supports novel transcript discovery even with the reference GTF file…
user1995
  • 173
  • 2
  • 8
3
votes
1 answer

How to select only RNA with hetero atoms from a PDB file with Python?

I'm trying to separate RNA from protein in a complex protein/RNA PDB file, and I want all the RNA info with the hetero atoms in between the bases BUT without H2O etc. In short, I want the RNA part of the PDB file without discontinuous lines. I managed to separate…
Raph
  • 61
  • 5
3
votes
2 answers

Error while calling bcftools mpileup - Failed to open -: unknown file type

I have sequenced a bacterial genome with a GridION from ONT. Basically what I want to check is whether or not trimming 50 bp at the beginning of the reads will improve alignment against the reference genome and ultimately the call of a consensus…
BCArg
  • 283
  • 2
  • 12
3
votes
2 answers

Produce a single sequential FASTA sequence out of BAM

I'm having problems properly looking for a solution because I'm a layman in Bioinformatics not familiar with the terminology. I'm hoping you can nudge me in the right direction, please! Thank you very much. What I want in the end: a FASTA file…
3
votes
1 answer

"perl: warning: Setting locale failed." in RepeatMasker

I'm trying to run RepeatMasker in Linux on the command line with: ./RepeatMasker -species human -alu -gff -dir /mnt/lustre/users/Analysis/RepeatMaskerOutput…
RNAdey
  • 31
  • 3
3
votes
1 answer

Occupancy of TFs with the target genes

The occupancy of SMARCD3 in the target genes listed below. I want to see the average, normalized ChIP-seq signal at the promoter-proximal region (1000 bp upstream and downstream of the TSS). I have 4 different experimental conditions (overlaid in one…
kcm
  • 1,804
  • 12
  • 27
3
votes
1 answer

Fast processing of fastq data

I am trying to write a Python script for customized filtering of a FASTQ file (size >3 GB). My proposed script is as follows: def filtering(read): time.sleep(0.1) if len(read) >= 15 and \ len(read) <= 30 and \ np.mean(read.qual)…
Lot_to_learn
  • 530
  • 3
  • 14
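As a rough illustration of the kind of filter described in this question, here is a minimal streaming sketch using only the standard library. It keeps reads of 15 to 30 bases, as in the excerpt. The mean-quality cutoff is cut off in the excerpt, so `min_mean_q` and its default of 20 are purely hypothetical placeholders, as are the function names. Phred+33 encoding and four-line FASTQ records are also assumptions.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def passes(seq, qual, min_len=15, max_len=30, min_mean_q=20.0):
    """Length bounds come from the excerpt; min_mean_q is a hypothetical cutoff."""
    if not (min_len <= len(seq) <= max_len):
        return False
    scores = [ord(c) - 33 for c in qual]  # assumes Phred+33 encoding
    return mean(scores) >= min_mean_q


def filter_fastq(in_handle, out_handle, **kwargs):
    """Write passing records to out_handle and return how many were kept."""
    kept = 0
    for header, seq, qual in read_fastq(in_handle):
        if passes(seq, qual, **kwargs):
            out_handle.write(f"{header}\n{seq}\n+\n{qual}\n")
            kept += 1
    return kept


if __name__ == "__main__":
    demo = io.StringIO("@r1\n" + "A" * 20 + "\n+\n" + "I" * 20 + "\n")
    out = io.StringIO()
    print(filter_fastq(demo, out), "read(s) kept")
```

Reading one record at a time keeps memory use flat regardless of file size, which matters more for a >3 GB input than the per-read arithmetic does.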
3
votes
1 answer

Detecting broad peaks in sRNA-seq data

What kind of tool would be appropriate to detect "broad peaks" in small RNA-seq sequencing data? MACS2 appears to be developed for ChIP-seq data, but I see that there is a --nomodel option. Would that make this program usable in my case? I suppose…
bli
  • 3,130
  • 2
  • 15
  • 36
3
votes
1 answer

RNASeq read coverage in protein space?

I used samtools depth to find the per base-pair read coverage over a number of isoform contigs from my Trinity assembly. I have also conducted a multiple sequence alignment of those isoforms using Clustal Omega DNA aligner. In my final step, I…
CephBirk
  • 151
  • 8
3
votes
3 answers

Find overlap between VCF files

I have two VCF files and I want to compare the missing rate in each of them, but I want to only look at sites that are present in both files. How could I go about getting a list of positions that are included in both VCF files and then filtering…
Sarah
  • 486
  • 1
  • 4
  • 18
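One way to picture the task in this question is a set intersection on (CHROM, POS) pairs, sketched below with the standard library only. The function names are hypothetical, the input is assumed to be uncompressed VCF text lines, and sites are matched by position alone (alleles are not compared), which may or may not be what the asker needs.

```python
def site_keys(lines):
    """Collect (chrom, pos) for every record line, skipping '#' header lines."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        keys.add((chrom, int(pos)))
    return keys


def keep_shared(lines, shared):
    """Yield header lines unchanged plus records whose site is in `shared`."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        if not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if (chrom, int(pos)) in shared:
            yield line


if __name__ == "__main__":
    vcf_a = ["#CHROM\tPOS\tID\tREF\tALT\n", "chr1\t100\t.\tA\tG\n", "chr1\t200\t.\tC\tT\n"]
    vcf_b = ["#CHROM\tPOS\tID\tREF\tALT\n", "chr1\t200\t.\tC\tT\n"]
    shared = site_keys(vcf_a) & site_keys(vcf_b)
    print("".join(keep_shared(vcf_a, shared)), end="")
```

Each file is then filtered against the same `shared` set, so the missing rate can be computed over an identical list of sites in both.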
3
votes
2 answers

Autodetect max number of cores and pass as an argument in Nextflow

I am creating a pipeline in Nextflow. One step is creating a pangenome with Roary. Roary takes threads as an argument and if no number of threads is supplied as an argument it defaults to one. Is there a way in Nextflow to pass the maximum number…
TW93
  • 449
  • 3
  • 11
3
votes
1 answer

qPCR: Why are fold change and standard deviation calculated after transformation?

I am analyzing data from a quantitative polymerase chain reaction (qPCR) using R. After cleaning the raw data, it looks something like this: > dput(x) structure(list(Reporter = c("FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM",…
Samuel
  • 133
  • 6