[Israel.pm] Starting XML

Jason Elbaum jason.elbaum at gmail.com
Sat Dec 15 22:18:18 PST 2007


Thanks to everyone for the tips.

To be more specific...

On Dec 14, 2007 4:10 PM, Yona Shlomo <yona at cs.technion.ac.il> wrote:
> What is an example to what you're trying to do?

I'm reading a set of files which contain the results of a multiplayer
team game.

For example, one file for each game contains the setting of the game
(date, time, location, etc):

<game type="R" time="17:00" date="010203">
	<team code="abc" id="1915" name="Team1" />
	<team code="def" id="1909" name="Team2" />
	<location id="3270" name="Some city" />
</game>

Another file contains the lists of player names and details:

<team id="..." name="...">
	<player id="123456" first="..." last="..." role="P" status="A" .../>
	...
</team>

Another file contains the list of game events, with the player(s)
involved in each event, the type of event, its effect on the game
state and a text description (including various details that don't
interest me):

<events>
	<event num="1" playerid="123456" des="Something happens to some players.  " type="Something">
		<effect id="567890" start="state1" end="state2" score="2"/>
	</event>
	...


My goal is to read the relevant game data and produce statistical
reports of various types, such as the average points scored on each
type of event, perhaps broken down by team or location or player or
time of day or range of dates or game state. I don't know in advance
what analyses I'll want to run, so I'm looking to develop a general
infrastructure to support them.
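
To make the kind of report concrete, here is a minimal sketch of one
such analysis: the average score per group, where the grouping key is
supplied by the caller. It is written in Python purely for illustration
(the same shape carries over to Perl). The record layout, the field
names and the sample values are assumptions, not the real data.

```python
from collections import defaultdict


def average_score(events, key):
    """Return {group: mean score}, grouping events by key(event)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of scores, count]
    for ev in events:
        bucket = totals[key(ev)]
        bucket[0] += ev["score"]
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Made-up sample records; the field names are hypothetical.
sample = [
    {"type": "Something", "team": "abc", "score": 2},
    {"type": "Something", "team": "def", "score": 0},
    {"type": "Other", "team": "abc", "score": 3},
]

by_type = average_score(sample, key=lambda ev: ev["type"])
by_team = average_score(sample, key=lambda ev: ev["team"])
```

Because the key is just a function, breaking the same figures down by
location, player or time of day needs no change to the aggregation code.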

I'm thinking of a stream and filter model, where the file reader will
produce a stream of GameEvent objects containing all the relevant
details of a single event, and the client code will receive the stream
and (optionally) filter the ones of interest to it. This has the
drawback that the filtering takes place only after the file is read,
so it's inefficient if the client is selecting a particular set of
games, but the data set is currently small enough that this doesn't
bother me.
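
A rough sketch of that stream and filter idea, again in Python for
illustration only. The GameEvent fields and the read_events name are my
own invention, the second sample event is made up, and the code assumes
every <event> carries exactly one <effect> child.

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class GameEvent:
    num: int
    player_id: str
    kind: str
    start: str
    end: str
    score: int


def read_events(source):
    """Yield one GameEvent per <event> element, in document order."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "event":
            continue
        effect = elem.find("effect")  # assumed to be present exactly once
        yield GameEvent(
            num=int(elem.get("num")),
            player_id=elem.get("playerid"),
            kind=elem.get("type"),
            start=effect.get("start"),
            end=effect.get("end"),
            score=int(effect.get("score")),
        )
        elem.clear()  # drop the element's contents once its data is copied


# First event is the sample from above; the second one is made up.
SAMPLE = b"""<events>
<event num="1" playerid="123456" des="Something happens to some
players.  " type="Something"><effect id="567890" start="state1"
end="state2" score="2"/></event>
<event num="2" playerid="654321" des="Nothing much." type="Other"><effect
id="567891" start="state2" end="state2" score="0"/></event>
</events>"""

# The client filters the stream; here it keeps only scoring events.
scoring = [ev for ev in read_events(io.BytesIO(SAMPLE)) if ev.score > 0]
```

The reader knows nothing about the filter, which is the drawback
mentioned above: every event is parsed before the client can reject it.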

Alternatively, I might load all the event data into a relational
database, so that filtering becomes a matter of querying the database.
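
As a sketch of the database variant, assuming an in-memory SQLite
database and a single flat table: the table name, column names and rows
below are all hypothetical, chosen only to mirror the event attributes
shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE event (
           game_id TEXT, num INTEGER, player_id TEXT, type TEXT,
           start_state TEXT, end_state TEXT, score INTEGER)"""
)

# Made-up rows standing in for events extracted from the XML files.
rows = [
    ("g1", 1, "123456", "Something", "state1", "state2", 2),
    ("g1", 2, "654321", "Something", "state2", "state2", 0),
    ("g2", 1, "123456", "Other", "state1", "state3", 3),
]
conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Average score per event type; the filtering and grouping live in SQL.
averages = conn.execute(
    "SELECT type, AVG(score) FROM event GROUP BY type ORDER BY type"
).fetchall()

# Restricting to one game is just a WHERE clause.
game1_total = conn.execute(
    "SELECT SUM(score) FROM event WHERE game_id = ?", ("g1",)
).fetchone()[0]
```

Here the selection happens inside the query, so picking out a particular
set of games no longer requires reading every file first.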


In any case, the XML side only involves reading the above file types
and extracting the data items of interest.
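
For the game file, the extraction could look roughly like this with a
real XML parser (Python's standard library here, for illustration). The
read_game name and the shape of the returned dictionary are assumptions,
and the date is left as a raw string because its format is not something
I want to guess at in code.

```python
import xml.etree.ElementTree as ET

GAME_XML = """<game type="R" time="17:00" date="010203">
	<team code="abc" id="1915" name="Team1" />
	<team code="def" id="1909" name="Team2" />
	<location id="3270" name="Some city" />
</game>"""


def read_game(xml_text):
    """Pull the items of interest out of one <game> document."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.get("type"),
        "time": root.get("time"),
        "date": root.get("date"),  # kept as-is, not interpreted
        "teams": [team.get("code") for team in root.findall("team")],
        "location": root.find("location").get("name"),
    }


game = read_game(GAME_XML)
```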

Obviously, I could also do this with a set of regular expressions, but
that seems wasteful and unreliable: attribute order, quoting and line
breaks inside attribute values can all vary.

Regards,

Jason


