Modified | DDT |>| WED.MAR,990310,11:55-4 | | CoSy/Home ; CoSy/Current | © Coherent Systems Inc . |
I have uploaded the files :
art1raw.txt The raw text copied from the
CSPAN HTML source for the Article 1 vote .
art1.txt Cleaned up ASCII file , Article 1 .
art3.txt Cleaned up ASCII file , Article 3 .
So you can download them , download K and do your own analyses .
In CoSy/APL :
Had to do in parts so this is the last iteration of reading CSPAN page source .
 rho Q is 20000 40000 ASCIIREAD '\COSY\ToCoSy.TXT' |>| 6363 |#|
 rho QW is Q RPL 'ø<tr><td><b>ø\ ' |>| 5599 |#|
 rho QW is QW RPL 'ø</b></td><td></td><td>ø ; ' |>| 4497 |#|
 rho QW is QW RPL 'ø</td><td> </td><td align=center>ø ; ' |>| 2699 |#|
 rho QW is QW RPL 'ø</td></tr>ø ' |>| 2177 |#|
 rho QW is QW RPL 'ø<tr></tr><tr></tr>ø' |>| 2087 |#|
K :
Seeing that this cleaned up the text sufficiently , and to go further in CoSy would be a pain , I decided it was a good time to try to do something practical in K . These lines can be copied into a running K process to execute .
|(|
Q : 0: "C:/cosy/art1raw.txt" / Read the file of HTML copied from CSPAN .
# Q / Count of Q . Each line becomes an item .
486
`show $ `Q / GUI Display Q
2 # Q / First couple of items in the raw file
("<tr><td><b><A NAME=A>Abercrombie, Neil</a></b></td> <td></td><td>D-HI</td><td> </td><td align=center> NAY </td></tr>"
Q : _ssr[ ; "<tr><td><b>" ; "" ]' Q / K`s StringSearchReplace .
Q : _ssr[ ; "</b></td><td></td><td>" ; " | " ]' Q
Q : _ssr[ ; "</td><td> </td><td align=center>" ; " | " ]' Q
Q : _ssr[ ; "</td></tr>" ; " " ]' Q
/ Note the inserted '|' character to delimit Table Data items .
QW : { ( 0 , & x = "|" ) _ x }' Q
/ Apply to each item of Q an ad hoc function which splits its argument
/ where it equals the '|' character .
QW : QW[ & 3 = #:' QW ] / Select those rows which split in 3 . Those are
/ the data rows .
r : ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )
/ Boolean selecting items where the second sub item matches '*D-*' and
/ likewise , the third contains 'AYE' . Note , the 1st index is 0 .
/ The Democrat Honor Roll for Article 1 Grand Jury Perjury :
,/' QW[ & r ] / Select items of QW where r = 1 .
("Goode, Virgil H., Jr. | D-VA | AYE "
 "Hall, Ralph M. | D-TX | AYE "
 "John, Christopher | D-LA | AYE "
 "McHale, Paul | D-PA | AYE "
 "Stenholm, Charles W. | D-TX | AYE "
 "Taylor, Gene | D-MS | AYE ")
/ ,/' strings the sub items of each item back together again .
/ I cleaned up the spacing in the display by hand .
/ Republican DisHonor Roll :
,/' QW[ & ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" ) ]
("Houghton, Amo | R-NY | NAY "
 "King, Peter T. | R-NY | NAY "
 "Morella, Constance A. | R-MD | NAY "
 "Shays, Christopher | R-CT | NAY "
 "Souder, Mark E. | R-IN | NAY ")
/ A couple of other ways to write the selections .
&/ ( + QW[; 1 2 ] ) _sm ( "*D-*" ; "*AYE*" )
/ AND across the 2 columns flipped , each matched with its
/ corresponding phrase .
&/ ~ QW[;2] _sm/: ( "*AYE*" ; "*NAY*" )
/ AND across each item in column 2 stringMatched with each item
/ in the list ( "*AYE*" ; "*NAY*" ) .
+/ ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )
6
+/ ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" )
5
"c:/cosy/art1.txt" 0: ,/' QW / Write the cleaned up items to a text
/ file with sub items catenated back together .
QWE : { ( 0 , & x = "|" ) _ x }' 0: "C:/cosy/art1.txt" / Read the File and split it again .
QW ~ QWE
1 / Matches .
/ The Democrat Honor Roll for Article 3 , Conspiracy to Obstruct Justice :
("Goode, Virgil H., Jr. | D-VA | AYE "
 "Hall, Ralph M. | D-TX | AYE "
 "John, Christopher | D-LA | AYE "
 "McHale, Paul | D-PA | AYE "
 "Stenholm, Charles W. | D-TX | AYE "
 "Taylor, Gene | D-MS | AYE ")
|)|
Note that this code will work with world class efficiency on gigabyte sets of data .
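For readers without a K interpreter , the same clean , split and select steps can be sketched in Python . This is only an illustration , not the code that was run : the names RAW , REPLACEMENTS , clean , split_fields and select are hypothetical , and the sample rows are shaped by hand to match the replacement patterns ( the real file also carries anchor tags such as <A NAME=A> ) . Unlike the K version , which keeps the '|' at the head of each sub item , this sketch drops the delimiter and trims the spaces .

```python
from fnmatch import fnmatchcase

# Hypothetical sample rows in the shape of the CSPAN table source.
RAW = [
    "<tr><td><b>Abercrombie, Neil</b></td><td></td><td>D-HI</td><td> </td><td align=center> NAY </td></tr>",
    "<tr><td><b>Goode, Virgil H., Jr.</b></td><td></td><td>D-VA</td><td> </td><td align=center> AYE </td></tr>",
    "<tr><td><b>Houghton, Amo</b></td><td></td><td>R-NY</td><td> </td><td align=center> NAY </td></tr>",
    "<tr></tr><tr></tr>",  # non-data row; it does not split in 3
]

# The same four search/replace pairs as the _ssr lines, applied in order.
REPLACEMENTS = [
    ("<tr><td><b>", ""),
    ("</b></td><td></td><td>", " | "),
    ("</td><td> </td><td align=center>", " | "),
    ("</td></tr>", " "),
]


def clean(line):
    """Strip the table markup, leaving '|' between the data items."""
    for old, new in REPLACEMENTS:
        line = line.replace(old, new)
    return line


def split_fields(line):
    """Split a cleaned line at each '|' and trim surrounding spaces."""
    return [field.strip() for field in line.split("|")]


rows = [split_fields(clean(line)) for line in RAW]
rows = [row for row in rows if len(row) == 3]  # keep only the data rows


def select(rows, party_pattern, vote_pattern):
    """Build a Boolean mask from two wildcard matches, then keep the true rows."""
    mask = [
        fnmatchcase(row[1], party_pattern) and fnmatchcase(row[2], vote_pattern)
        for row in rows
    ]
    return [row for row, keep in zip(rows, mask) if keep]


dem_aye = select(rows, "*D-*", "*AYE*")
rep_nay = select(rows, "*R-*", "*NAY*")
print(dem_aye)
print(rep_nay)
```

The wildcard patterns are the same ones given to _sm above ; fnmatchcase is used here only because it accepts the same '*' syntax .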
I`ve had a running debate with some core K programmers about the use of
Boolean vectors in K versus traditional APLs . I`d ask how else they would
handle the logic here . I think the only real difference is that `where ,
`& in K , converts Booleans to indexes , replacing APL compression :
& V in K is V / iota rho V in APL .
I continue to feel the lack of true Bools in K is a substantial
limitation .
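The two selection styles can be compared in a small Python sketch . It is an illustration under assumptions , not K or APL : the votes list is made-up sample data , where is a hypothetical helper standing in for K`s & on a Boolean vector ( or V / iota rho V with index origin 0 ) , and itertools.compress stands in for APL compression .

```python
from itertools import compress

# Hypothetical sample data.
votes = ["AYE", "NAY", "AYE", "NAY", "AYE"]

# Boolean vector: one flag per item.
mask = [vote == "AYE" for vote in votes]


def where(flags):
    """Indexes of the true items, counting from 0."""
    return [i for i, flag in enumerate(flags) if flag]


idx = where(mask)                          # K style: Booleans -> indexes
by_index = [votes[i] for i in idx]         # ... then select by indexing
by_compress = list(compress(votes, mask))  # APL style: compress by the mask
count = sum(mask)                          # like +/ over the Boolean vector

print(idx, by_index, by_compress, count)
```

Both routes pick the same items ; the difference is only whether the Boolean vector is used directly or first turned into indexes .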
Feedback : bob@cosy.com
NB : I reserve the right to post all communications I
receive or generate to the CoSy website for further reflection .