# HG changeset patch
# User Steve Losh <steve@stevelosh.com>
# Date 1579841958 18000
# Node ID 008a451ebd8e48bc9057338da8d9e43c53f06eeb
# Parent  2e4a81e16a6844c8e7fa457ea5983f7188a1f319
Update

diff -r 2e4a81e16a68 -r 008a451ebd8e README.markdown
--- a/README.markdown	Thu Jan 23 00:14:55 2020 -0500
+++ b/README.markdown	Thu Jan 23 23:59:18 2020 -0500
@@ -156,7 +156,7 @@
    do an ugly hack in `float-string` to make the output consistently match
    Rosalind's style everywhere.
 
-## 2020-01-23
+## 2020-01-22
 
 Started doing Rosalind problems in shell/Awk, to join in with a group of folks
 at work doing them for the first time.  The first few problems are simple, but
@@ -167,3 +167,19 @@
 Worked.  It's so nice to have a site generator I designed myself that's not
 going to change out from under me when I just want to fix something simple and
 move on with my life.
+
+## 2020-01-23
+
+Lesk book exercises, chapter 1:
+
+1. Average density would be `(/ 3e4 3e9) = 1/100000` or 1 gene every ~100kb.
+2. Parts:
+   A. Two humans would have roughly 250k differences.
+   B. A human and a chimpanzee would have roughly 3m differences.
+3. Parts:
+   A. Average density would be 2 SNPs per 1000 base pairs.
+   B. Is this a trick question?  I think 1.1% of the differences would be in protein-coding regions.
+4. Parts:
+   A. I am confused.  The glossary defines a "haplotype" as a group of closely-linked genes that are typically inherited together.  So… 1 haplotype?  But the *text* talks about a haplotype as being a combination of SNPs in a recombination-poor region.  So if "haplotype" means the combination of SNPs, not a set of genes, then I think this would be `4^10 = 1048576`.
+   B. On a diploid chromosome, you'd have two separate sets of SNPs to combine (and order doesn't matter), so I think it's `4^10 * 4^10 / 2`.
+   C. `(1 SNP / 5 kb) * (100 kb) = 20 SNPs`, which gives `2^20 = 1048576` sequences.