> So this is a wonderful web page that was pointed out to me
> the other day!

Thanks! Merging is one of the most popular things people use the
Scriptome for, since Excel doesn't do it. (Pardon the long-windedness
of my explanations below.)

> I have 2 files that I would like to combine. File 1 contains:
> chrno,marker,homozygotes,rare,observed,expected,pvalue,af
> 01,rs11206709,714,6,178,169,0.2147,Cases
> 01,rs11206709,746,11,158,162,0.4557,Controls
>
> File 2 contains counts of inheritance inconsistencies:
> 01,rs10157092,1
> 01,rs1075364,2
>
> File 1 contains 952,185 lines minus header and File 2
> contains 7632. The majority of the markers don't have an
> inconsistency.
>
> I would like to merge these files together by marker and have File 1
> contain a new column of counts of inheritance inconsistencies. I
> tried using the merge command with $col1=1 but got an empty output
> file.

A few points.

1. The files for the merge tools (in fact, for most of the tools) need
to be tab-separated, not comma-separated. But don't panic! Just use
change_any_separator_to_tab (the first tool on the Change page) to
change the commas to tabs. In fact, the example for that tool already
has comma as the separator, so all you need to edit is the filenames.

2. For the actual merge, it sounds like you want to use
merge_lines_based_on_shared_column. It takes each pair of
corresponding lines from the two files and merges them together to
make a new, single line with more columns. The other merge tools, like
merge_intersection, take some lines from one file and print them, and
after all of those, they print some lines from the other file. (See
the examples for the different merge tools on the website to clarify
what each one does.)

3. You'll need to enter the column number for each file to do the
merge on, which in this case means $col1=1; $col2=1;

4. Once you've merged, Excel should be able to read the tab-delimited
output. (Excel can read up to about 65,000 lines.)
You can use Excel or other Scriptome tools to change back to
comma-separated, get rid of some columns, etc.

Please write again if you have more questions, or if things break.

- Amir Karger
Research Computing
Bauer Center for Genomics Research
Harvard University
617-496-0626

>
> Thanks very much in advance, Peggy White 5/24
>
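
For anyone who would rather script the whole job, the steps above (read
the comma-separated files, match lines on the marker column, write
tab-separated output) can be sketched in Python. This is only an
illustration of the idea, not the Scriptome tools' own code. The
function names, the "inconsistencies" header, and the last File 1 row
in the sample data are made up for the example. The sketch also assumes
that column numbers count from 0 (so 1 is the marker column) and that
markers missing from File 2 should get a count of 0.

```python
import csv
import io


def read_rows(text, delimiter=","):
    """Parse delimited text into rows (lists of strings), skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]


def add_count_column(rows1, rows2, col1=1, col2=1, count_col=2, missing="0"):
    """Append to each File 1 row the count that File 2 lists for the same marker.

    Column numbers count from 0 here (an assumption of this sketch).
    Rows whose marker is absent from File 2 get the `missing` value.
    """
    counts = {row[col2]: row[count_col] for row in rows2}
    return [row + [counts.get(row[col1], missing)] for row in rows1]


def to_tab_separated(rows):
    """Join rows into tab-separated text, one row per line."""
    return "\n".join("\t".join(row) for row in rows)


# Sample data: the last File 1 row is invented so that one marker matches File 2.
file1 = """chrno,marker,homozygotes,rare,observed,expected,pvalue,af
01,rs11206709,714,6,178,169,0.2147,Cases
01,rs11206709,746,11,158,162,0.4557,Controls
01,rs10157092,700,5,170,165,0.3000,Cases
"""
file2 = """01,rs10157092,1
01,rs1075364,2
"""

# Keep the header line out of the merge and give the new column a name.
header, *data = read_rows(file1)
merged = [header + ["inconsistencies"]] + add_count_column(data, read_rows(file2))
print(to_tab_separated(merged))
```

Building a lookup table from the small file (7632 lines) and streaming
through the large one keeps memory use modest. For the real files you
would read from disk with open() instead of the inline strings.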