31f98ce9303a

Update
author Steve Losh <steve@stevelosh.com>
date Tue, 25 Feb 2020 11:26:40 -0500
parents 0f1de4b551b0
children 21840941f626
branches/tags (none)
files README.markdown

Changes

--- a/README.markdown	Mon Feb 24 23:56:15 2020 -0500
+++ b/README.markdown	Tue Feb 25 11:26:40 2020 -0500
@@ -1013,3 +1013,23 @@
 out of sync because we grepped out the overrepresented sequences naively.  Tried
 restarting the alignment on the original data and it's still going, so that's
 probably it.  Need to figure out how to filter those bad reads properly tomorrow.
+
+## 2020-02-25
+
+Class.  Chatting about QC and such.
+
+Professor says he asked around and people haven't seen the first 5 bases being
+lower quality before, but that the explanation from Illumina makes sense, and
+that we should *not* trim those bases unless they seem to be causing issues.
+Also talked about how to filter out the overrepresented sequences.  He thought
+fastx had something for this, but I can't find anything in the
+documentation.  I may need to write some code.
+
+Professor installed `pigz` on the server, so I can remove my hacky `xargs`
+workaround.
+
+Need to find a way to filter bad sequences from BOTH files at once.
+
+Need to align to the NCBI reference GTF instead of the other weird one.
+
+