--- a/README.markdown	Thu Jan 23 00:14:55 2020 -0500
+++ b/README.markdown	Thu Jan 23 23:59:18 2020 -0500
@@ -156,7 +156,7 @@
    do an ugly hack in `float-string` to make the output consistently match
    Rosalind's style everywhere.

-## 2020-01-23
+## 2020-01-22

 Started doing Rosalind problems in shell/Awk, to join in with a group of folks
 at work doing them for the first time.  The first few problems are simple, but
@@ -167,3 +167,19 @@
 Worked.  It's so nice to have a site generator I designed myself that's not
 going to change out from under me when I just want to fix something simple and
 move on with my life.
+
+## 2020-01-23
+
+Lesk book exercises, chapter 1:
+
+1. Average density would be `(/ 3e4 3e9) = 1/100000` or 1 gene every ~100kb.
+2. Parts:
+   A. Two humans would have roughly 250k differences.
+   B. A human and a chimpanzee would have roughly 3m differences.
+3. Parts:
+   A. Average density would be 2 SNPs per 1000 base pairs.
+   B. Is this a trick question?  I think 1.1% of the differences would be in protein-coding regions.
+4. Parts:
+   A. I am confused.  The glossary defines a "haplotype" as a group of closely-linked genes that are typically inherited together.  So… 1 haplotype?  But the *text* talks about a haplotype as being a combination of SNPs in a recombination-poor region.  So if "haplotype" means the combination of SNPs, not a set of genes, then I think this would be `4^10 = 1048576`.
+   B. On a diploid chromosome, you'd have two separate sets of SNPs to combine (and order doesn't matter), so I think it's `4^10 * 4^10 / 2`.
+   C. `(1 SNP / 5 kb) * (100 kb) = 20 SNPs`, so `2^20 = 1048576` sequences.
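
The arithmetic in the added entry can be re-checked with a short script. This is only a sketch under stated assumptions: the figures of roughly 30,000 genes and 3 billion base pairs are read off the `(/ ...)` expression in exercise 1, the 10 SNP sites with 4 possible bases each are inferred from `4^10`, and every variable name is hypothetical. For 4B it computes the entry's own estimate and, for comparison, the count of unordered pairs when a chromosome pair may carry the same haplotype twice, which is `n * (n + 1) / 2`.

```python
from fractions import Fraction

# Exercise 1: assumed ~30,000 genes in a ~3,000,000,000 bp genome.
gene_density = Fraction(30_000, 3_000_000_000)  # genes per base pair

# Exercise 4A: assumed 10 SNP sites, 4 possible bases at each site.
haplotypes = 4 ** 10

# Exercise 4B: the entry's estimate (n * n / 2) ...
pairs_entry = haplotypes * haplotypes // 2
# ... and the unordered-pair count that also allows two identical haplotypes.
pairs_with_repeats = haplotypes * (haplotypes + 1) // 2

# Exercise 4C: 1 SNP per 5 kb over a 100 kb region, 2 alleles per SNP.
snps = Fraction(1, 5) * 100
sequences = 2 ** int(snps)

if __name__ == "__main__":
    print("gene density:", gene_density)
    print("haplotypes (4A):", haplotypes)
    print("pairs, entry's estimate (4B):", pairs_entry)
    print("pairs, allowing repeats (4B):", pairs_with_repeats)
    print("SNPs in region (4C):", snps)
    print("sequences (4C):", sequences)
```

The two 4B counts differ by `n / 2`, which is small next to `n^2 / 2`, so the entry's estimate is close either way.
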
author	Steve Losh <steve@stevelosh.com>
date	Thu, 23 Jan 2020 23:59:18 -0500
parents	2e4a81e16a68
children	ff3116e5eac2
branches/tags	(none)
files	README.markdown