Most Popular

1500 questions
4 votes
2 answers

Simulate fast5 MinION run

I'm planning to use an Oxford Nanopore MinION sequencer to test a script that I'm working on. However, I would rather not perform multiple MinION runs to validate the tool. I'd like to simulate various MinION runs and get raw reads and all pore events…
andresito
  • 385
  • 1
  • 3
  • 9
4 votes
1 answer

How to obtain clusters of hierarchical heatmap when using Python?

Is there a good way of obtaining the labels (e.g. genes) within individual clusters that have been clustered hierarchically in Python (preferably, but not necessarily, with seaborn)? I found these difficulties highlighted elsewhere (here, and…
tsttst
  • 153
  • 6
4 votes
1 answer

Using large databases with BLAT

I'm a computer scientist working with biologists at a small school that doesn't have dedicated bioinformatics staff. I apologize if I use incorrect terminology since I have limited bioinformatics background. Our biology staff is studying a soil…
cbake
  • 41
  • 1
4 votes
2 answers

How are centromeric/telomeric regions categorized in RepeatMasker? "Other" category?

I haven't been able to find the following in the documentation at http://www.repeatmasker.org/ (1) Why are certain regions classified as "Other"? Are these regions impossible to concretely classify due to alignment issues? (2) I believe that…
ShanZhengYang
  • 1,691
  • 1
  • 14
  • 20
4 votes
2 answers

Retrieving a list of human genes having GO associations

I need a command in R to retrieve all human genes associated with a Gene Ontology entry. I tried to look for it online but did not find it.
4 votes
1 answer

Recommendations for missing value imputation - DNA methylation data

I'm looking for some options for imputation for a high-dimensional dataset of DNA methylation (bisulfite sequencing) data. Dimensions are on the order of 50-100 samples x ~500,000 CpG loci/features. I've used K-nearest neighbors, but it seems that this…
Reilstein
  • 367
  • 1
  • 14
4 votes
1 answer

Using BLAT command line tool to blat split sequences

I have the nucleotide sequence: AATTGAGGCACATTTTTTTTTAGACAGTCTTGCTCTGTTGCCCAGGCTGGAGTGCAGTGGTGTGATCATAGCTCACTGCAGCCTCGACCTCCTGGGCTCAACAAAGCACACAGTGGGCGGATCCCCACCAG When I blat this on UCSC Genome Browser, the first hit is a match which spans 6572…
Set
  • 241
  • 1
  • 8
4 votes
2 answers

Transform traditional blast output to `--outfmt 6`

I have run a blastx of metagenomic datasets (raw Illumina reads) against the nr database. Unfortunately, I forgot to add the --outfmt 6 argument to the command and got the traditional output. Could I parse this output into --outfmt 6?
andresito
  • 385
  • 1
  • 3
  • 9
4 votes
1 answer

Adjusting phenotypes by regressing out covariates

I'm trying to use the bfGWAS tool, which analyses GWAS data and integrates functional annotations to identify causal SNPs (paper and github). In the user manual, it states: We recommend first regressing covariants out from the original …
steiny
  • 143
  • 3
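For readers unfamiliar with the idea, "regressing out" a covariate usually means fitting a regression of the phenotype on the covariate and keeping the residuals as the adjusted phenotype. The sketch below is a minimal illustration only, not the bfGWAS procedure: it assumes a single numeric covariate and ordinary least squares with an intercept, and the function name `regress_out` is hypothetical. Several covariates would need a multiple-regression fit instead.

```python
from typing import List, Sequence


def regress_out(phenotype: Sequence[float], covariate: Sequence[float]) -> List[float]:
    """Return residuals of an OLS fit: phenotype ~ intercept + slope * covariate."""
    n = len(phenotype)
    if n != len(covariate) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    mean_y = sum(phenotype) / n
    mean_x = sum(covariate) / n
    # Sum of squares of the covariate around its mean.
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        raise ValueError("covariate is constant; slope is undefined")
    # Cross-product of covariate and phenotype deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, phenotype))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed phenotype minus the value predicted from the covariate.
    return [y - (intercept + slope * x) for x, y in zip(covariate, phenotype)]
```

The residuals sum to zero by construction, so the adjusted phenotype is centered; whether it should also be rescaled depends on the downstream tool.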
4 votes
1 answer

Missing data mappings in mygene.info while trying to convert Genes Ensembl Ids to Entrez Ids

I need to convert a lot of Ensembl IDs to their counterparts in Entrez (e.g., ENSG00000157764 > 673). I found mygene.info and it seems to be what I needed. Let's see the query about ENSG00000157764 in action. We can easily find the key-value…
floatingpurr
  • 315
  • 1
  • 2
  • 7
4 votes
1 answer

Counting a specific consecutive character with its occurrence position and length

I have a sequence file and want to count runs of the consecutive character "N", reporting each run's position of occurrence and its length. Say I have a file named mySequence.fasta like this: >sequence-1…
user1414
  • 41
  • 1
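One possible approach to this kind of task is a regular-expression scan over each sequence. The sketch below is only an illustration under stated assumptions: the helper names `read_fasta` and `find_n_runs` are hypothetical, the input is assumed to be plain FASTA, both `N` and `n` are counted, and start positions are reported 1-based. The toy records in the usage section are invented, since the question's example file is truncated.

```python
import re
from typing import Dict, Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect sequences keyed by header (text after '>'), joining wrapped lines."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def find_n_runs(sequence: str) -> List[Tuple[int, int]]:
    """Return (1-based start, length) for every run of consecutive N/n."""
    return [(m.start() + 1, m.end() - m.start())
            for m in re.finditer(r"[Nn]+", sequence)]


if __name__ == "__main__":
    # Toy input invented for illustration; swap in open("mySequence.fasta").
    demo = [">sequence-1", "ACGTNNNNAC", "GTNNA"]
    for name, seq in read_fasta(demo).items():
        for start, length in find_n_runs(seq):
            print(name, start, length, sep="\t")
```

Joining wrapped lines before scanning matters: a run of N that spans a line break in the file is then reported once rather than as two shorter runs.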
4 votes
2 answers

What is the best distribution to model the FPKM values from normalized RNA-Seq data?

I know that the discrete raw counts from the RNA-Seq data are usually modeled by a negative binomial or a Poisson distribution, but what I am working on are the FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values which…
user5054
  • 305
  • 1
  • 8
4 votes
1 answer

JSmol - hide "JSmol" logo

I'm using JSmol for my SimRNAweb server. I would like to hide the "JSmol" logo from all the small visualizations that I have there. I would like to just put information about JSmol under the panel as text.
Marcin Magnus
  • 676
  • 3
  • 11
4 votes
2 answers

Problems extracting uniquely mapping reads from a BWA MEM alignment

I did a mapping of genomic paired-end reads to a reference assembly using bwa mem. I need to extract the reads that mapped only once to my reference. For that, I have tried to follow the method proposed in another post Obtaining uniquely mapped…
Marvin
  • 41
  • 3
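"Uniquely mapped" has no single agreed definition for bwa mem output, so any filter is a heuristic. The sketch below shows one commonly used combination, stated here as an assumption rather than as the method from the linked post: drop unmapped, secondary and supplementary records, require a minimum MAPQ, and drop reads carrying an `XA:Z` (alternative hits) or `SA:Z` (split alignment) tag. The function names and the default `min_mapq` are hypothetical.

```python
from typing import Iterable, Iterator


def is_uniquely_mapped(sam_line: str, min_mapq: int = 1) -> bool:
    """Heuristic check of one SAM alignment line (see assumptions in comments)."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return False  # not a complete alignment record
    flag = int(fields[1])
    mapq = int(fields[4])
    # SAM flag bits: 0x4 unmapped, 0x100 secondary, 0x800 supplementary.
    if flag & (0x4 | 0x100 | 0x800):
        return False
    if mapq < min_mapq:
        return False
    # Assumption: an XA:Z or SA:Z tag marks the read as not uniquely placed.
    tags = fields[11:]
    return not any(tag.startswith(("XA:Z:", "SA:Z:")) for tag in tags)


def filter_sam(lines: Iterable[str], min_mapq: int = 1) -> Iterator[str]:
    """Yield header lines unchanged and only alignment lines that pass the check."""
    for line in lines:
        if line.startswith("@") or is_uniquely_mapped(line, min_mapq):
            yield line
```

Note that this works record by record: for paired-end data one mate can pass while the other fails, so keeping pairs intact needs an extra step that groups records by read name.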
4 votes
2 answers

Where are .motif files from homer knownResults?

I have been using homer's findMotifsGenome.pl, but with my new version (v4.9.1) of homer I don't get .motif files in the knownResults folder. I do get them in the homerResults (de novo) folder, though. With my previous version of homer I did get…
benn
  • 3,571
  • 9
  • 28