[R] reading lisp file in R
peter dalgaard
pdalgd at gmail.com
Thu Jan 18 18:18:07 CET 2018
Yes, and the structure is obviously case-insensitive. Probably more troublesome is that an instance can have multiple ACADEMIC-EMPHASIS entries, which can be tricky to tidy. One would also need to figure out the meaning of lines like
(DEFPROP BOSTON-COLLEGE0 T DUPLICATE)
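
To make the repeated-key problem concrete, here is a minimal sketch in Python (not R, purely for illustration) of one way to read such forms and keep every value for a repeated key. The sample text, the instance name EXAMPLE-U and the helper names tokenize, parse and instances are all made up for this example; the sample is stitched together from fragments quoted in this thread and is not a record from the real file. DEFPROP lines are parsed but not interpreted, since their meaning is still unclear.

```python
import re
from collections import defaultdict

# Hypothetical sample: EXAMPLE-U is an invented instance, and the second
# ACADEMIC-EMPHASIS line is invented to show a repeated key.
SAMPLE = """
(def-instance Example-U
   (LOCATION URBAN)
   (QUALITY-OF-LIFE SCALE:1-5 4)
   (ACADEMIC-EMPHASIS HEALTH-SCIENCE)
   (ACADEMIC-EMPHASIS LIBERAL-ARTS)
)
(DEFPROP BOSTON-COLLEGE0 T DUPLICATE)
"""


def tokenize(text):
    # Each parenthesis is its own token; other tokens are runs of
    # non-space, non-parenthesis characters.
    return re.findall(r"[()]|[^\s()]+", text)


def parse(tokens):
    """Turn a flat token list into nested lists, one per top-level form."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            # Upper-case every symbol so that matching is case-insensitive.
            stack[-1].append(tok.upper())
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def instances(forms):
    """Map instance name -> {key: [value, ...]}, keeping repeated keys."""
    out = {}
    for form in forms:
        # Anything that is not a DEF-INSTANCE form (DEFPROP, stray text)
        # is skipped rather than interpreted.
        if not (isinstance(form, list) and len(form) >= 2
                and form[0] == "DEF-INSTANCE"):
            continue
        record = defaultdict(list)
        for prop in form[2:]:
            if isinstance(prop, list) and prop:
                key, values = prop[0], prop[1:]
                record[key].append(" ".join(str(v) for v in values))
        out[form[1]] = dict(record)
    return out


result = instances(parse(tokenize(SAMPLE)))
print(result["EXAMPLE-U"]["ACADEMIC-EMPHASIS"])
# prints ['HEALTH-SCIENCE', 'LIBERAL-ARTS']
```

Storing a list per key postpones the tidying decision: one could later emit one row per (instance, key, value) triple, or paste the values of a repeated key together into a single field.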
-pd
> On 18 Jan 2018, at 18:04, Barry Rowlingson <b.rowlingson at lancaster.ac.uk> wrote:
>
> The file also has a bunch of email headers stuck in the middle of it:
>
>
> .....
>
> (QUALITY-OF-LIFE SCALE:1-5 4)
> (ACADEMIC-EMPHASIS HEALTH-SCIENCE)
> )
> -------
> -------
>
> From LEBOWITZ at cs.columbia.edu Mon Feb 22 20:53:02 1988
> Received: from zodiac by meridian (5.52/4.7)
> Received: from Jessica.Stanford.EDU by ads.com (5.58/1.9)
> id AA04539; Mon, 22 Feb 88 20:59:59 PST
> Received: from Portia.Stanford.EDU by jessica.Stanford.EDU with TCP;
> Mon, 22 Feb 88 20:58:22 PST
> Received: from columbia.edu (COLUMBIA.EDU.ARPA) by Portia.STANFORD.EDU
> (1.2/Ultrix2.0-B)
> id AA11480; Mon, 22 Feb 88 20:49:53 pst
> Received: from CS.COLUMBIA.EDU by columbia.edu (5.54/1.14)
> id AA10186; Mon, 22 Feb 88 23:48:44 EST
> Message-Id: <8802230448.AA10186 at columbia.edu>
> Date: Fri 22 Jan 88 02:50:00-EST
> From: The Mailer Daemon <Mailer at cs.columbia.edu>
> To: LEBOWITZ at cs.columbia.edu
> Subject: Message of 18-Jan-88 20:13:54
> Resent-Date: Mon 22 Feb 88 23:44:07-EST
> Resent-From: Michael Lebowitz <LEBOWITZ at cs.columbia.edu>
> Resent-To: souders at portia.stanford.edu
> Resent-Message-Id: <12376918538.25.LEBOWITZ at CS.COLUMBIA.EDU>
> Status: R
>
> Message undeliverable and dequeued after 3 days:
> souders%meridian at ADS.ARPA: Cannot connect to host
> ------------
> Date: Mon 18 Jan 88 20:13:54-EST
> From: Michael Lebowitz <LEBOWITZ at CS.COLUMBIA.EDU>
> Subject: bigger file part 3
> To: souders%meridian at ADS.ARPA
> In-Reply-To: <8801182147.AA08014 at ADS.ARPA>
> Message-ID: <12367705229.11.LEBOWITZ at CS.COLUMBIA.EDU>
>
> (DEF-INSTANCE GEORGETOWN
> (STATE MARYLAND)
> (LOCATION URBAN)
> (CONTROL PRIVATE)
> (NO-OF-STUDENTS THOUS:10-15)
> (MALE:FEMALE RATIO:45:55)
> ....
>
> Which dates it to 1988. Nice.
>
> Barry
>
>
>
> On Thu, Jan 18, 2018 at 9:20 AM, Peter Crowther <peter.crowther at melandra.com> wrote:
>
>> That's a nice example of why Lisp is both powerful and terrifying - you're
>> looking at a Lisp *program*, not just Lisp *data*, as Lisp makes no
>> distinction between the two. You just read 'em in.
>>
>> The two definitions at the bottom are function definitions. The top one
>> defines the def-instance function. Reading that indicates that it accepts
>> an atom as a name and a list of key-value or key-range-value lists as
>> properties, where the keys may be repeated to give you multi-valued
>> attributes in your result. The bottom one defines a function for removing
>> duplicate entries of the same location.
>>
>> The rest of the file (apart from the included email headers) is a whole
>> load of calls to the def-instance function. In Lisp, you'd define the
>> functions, then just run the rest of the file.
>>
>> To my knowledge, there is no generic way to read Lisp "data" into anything
>> else, because of this quirk that data can look like anything. If anyone
>> can correct me on that, great, but I'd be somewhat surprised. Therefore,
>> as David intimated, the tools you need are generic tools for handling text,
>> and you'll have to deal with the formatting yourself. If I were doing a
>> one-off transform of this file, I'd probably reach for vi... but I'm an old
>> Unix hacker. I certainly wouldn't teach that tooling. awk or perl could
>> certainly handle it; or if you want to give students a wider view of the
>> world you might wish to try ANTLR and get them to write a grammar to parse
>> the file. The Clojure grammar (
>> https://github.com/antlr/grammars-v4/blob/master/clojure/Clojure.g4) would
>> be an interesting place to start, although Terence Parr's comment of "match
>> a bunch of crap in parentheses" would probably give a flavour of what to
>> implement. Depends what else the students are learning.
>>
>> Hope this helps rather than hinders.
>>
>> - Peter
>>
>> On 18 January 2018 at 05:25, Ranjan Maitra <maitra at email.com> wrote:
>>
>>> Thanks! I am trying to use it in R. (Actually, I try to give my students
>>> experiences with different kinds of files and I was wondering if there were
>>> tools available for such kinds of files. I don't know Lisp so I do not
>>> actually know what the lines towards the bottom of the file mean.)
>>>
>>> Many thanks for your response!
>>>
>>> Best wishes,
>>> Ranjan
>>>
>>> On Wed, 17 Jan 2018 20:59:48 -0800 David Winsemius <dwinsemius at comcast.net> wrote:
>>>
>>>>
>>>>> On Jan 17, 2018, at 8:22 PM, Ranjan Maitra <maitra at email.com> wrote:
>>>>>
>>>>> Dear friends,
>>>>>
>>>>> Is there a way to read data files written in lisp into R?
>>>>>
>>>>> Here is the file: https://archive.ics.uci.edu/ml/machine-learning-databases/university/university.data
>>>>>
>>>>> I would like to read it into R. Any suggestions?
>>>>
>>>> It's just a text file. What difficulties are you having?
>>>>>
>>>>>
>>>>> Thanks very much in advance for pointers on this and best wishes,
>>>>> Ranjan
>>>>>
>>>>> --
>>>>> Important Notice: This mailbox is ignored: e-mails are set to be
>>>>> deleted on receipt. Please respond to the mailing list if appropriate.
>>>>> For those needing to send personal or professional e-mail, please use
>>>>> appropriate addresses.
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>> 'Any technology distinguishable from magic is insufficiently advanced.'
>>>> -Gehm's Corollary to Clarke's Third Law
>>>>
>>>
>>>
>>
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com