From brian.suda at gmail.com Mon Oct 3 18:19:17 2005
From: brian.suda at gmail.com (Brian Suda)
Date: Mon Oct 3 18:19:19 2005
Subject: [microformats-dev] hCa* follow links
Message-ID: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>
On the discuss list someone had a question about what they should do
regarding encoding events (partial listings, teasers) on their
homepage which lead to full records that were marked-up with
hCalendar.
That got me to thinking about an hCal/hCard follow attribute value. I
know my REL example is probably a bad idea, but for example:
<a href="..." rel="vcard">contact</a>
When an 'a' element has 'vcard' or 'vevent' it could signify to the
transforming app to fetch this page too and consider it in the
transformation.
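A minimal sketch of that follow behaviour, in Python, might look like the
following. It assumes the hint lives in the rel attribute as in the example
above; the names FOLLOW_RELS, find_follow_links and collect_pages are
hypothetical, fetch is a caller-supplied function, and the max_pages cap is
only one possible guard against the run-away spider problem.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed hint values, taken from the rel example above.
FOLLOW_RELS = {"vcard", "vevent"}


class FollowLinkParser(HTMLParser):
    """Collect the targets of <a> elements whose rel carries a hint value."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = set((attr.get("rel") or "").lower().split())
        href = attr.get("href")
        if href and rels & FOLLOW_RELS:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def find_follow_links(html, base_url):
    parser = FollowLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def collect_pages(start_url, fetch, max_pages=10):
    """Fetch the start page plus hinted pages, stopping at max_pages.

    fetch is a hypothetical callable mapping a URL to its HTML text.
    """
    seen, queue, pages = set(), [start_url], {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        pages[url] = fetch(url)
        queue.extend(find_follow_links(pages[url], url))
    return pages
```

The transforming app would then run over every page in the returned dict
rather than over the start page alone.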
Is this a bad idea? would this little transforming app turn into a
run-away spider? or is this any help? because a spider will fetch
every page anyway, telling you that there is an hCa* on the page won't
even save any processor cycles, just because there isn't an explicit
mention of encoding doesn't mean there isn't (so you check for them
anyway), and just because someone says there is an hCa* at the end of
this link doesn't mean there is (so you check anyway)...
which kinda segues into the old Auto-Discovery idea.[1] It hasn't
been looked at in a while, so i'll bring it back to the forefront to
see if we can get some comments and thoughts.
The discuss list has had the rumblings of application support for
microformats, loads of Greasemonkey script links were passed around,
X2V, and rubhub mentions, but nothing about an hCalendar/hCard
repository or spider. If the discussion about FINDING these
microformats is solidified, then maybe some aggregating applications
might start to pop up. I know i'd certainly be interested in building
something like this!
-brian
[1] - http://microformats.org/wiki/hcard-brainstorming#Auto-Discovery
--
brian suda
http://suda.co.uk
From bud at thecommunityengine.com Tue Oct 4 13:37:00 2005
From: bud at thecommunityengine.com (Bud Gibson)
Date: Tue Oct 4 13:37:00 2005
Subject: [microformats-dev] hCa* follow links
In-Reply-To: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>
References: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>
Message-ID: <61911836-747E-43BD-B746-E07E4926B7D5@thecommunityengine.com>
What's really happening here is that we are starting to treat html
pages as data. I see no problem giving a hint as to what is at the
other end of a link.
One small thought on autodiscovery. Microformats don't have the
discovery framework of, say, web services. However, microformats do
identify themselves via their enclosing elements. Could one fake
autodiscovery by declaring web services that take data structures of
type whatever-the-microformat-enclosing-element-is and then providing
services based on a well-defined interface? The trick would be to
have a microformat registry or federation of registries that would
identify these enclosing elements. I started a bit down this path
with a Greasemonkey script I defined in August.
Bud
On Oct 3, 2005, at 21:19, Brian Suda wrote:
> On the discuss list someone had a question about what they should do
> regarding encoding events (partial listings, teasers) on their
> homepage which lead to full records that were marked-up with
> hCalendar.
>
> That got me to thinking about an hCal/hCard follow attribute value. I
> know my REL example is probably a bad idea, but for example:
> <a href="..." rel="vcard">contact</a>
>
> When an 'a' element has 'vcard' or 'vevent' it could signify to the
> transforming app to fetch this page too and consider it in the
> transformation.
>
> Is this a bad idea? would this little transforming app turn into a
> run-away spider? or is this any help? because a spider will fetch
> every page anyway, telling you that there is an hCa* on the page won't
> even save any processor cycles, just because there isn't an explicit
> mention of encoding doesn't mean there isn't (so you check for them
> anyway), and just because someone says there is an hCa* at the end of
> this link doesn't mean there is (so you check anyway)...
>
> which kinda segues into the old Auto-Discovery idea.[1] It hasn't
> been looked at in a while, so i'll bring it back to the forefront to
> see if we can get some comments and thoughts.
>
> The discuss list has had the rumblings of application support for
> microformats, loads of Greasemonkey script links were passed around,
> X2V, and rubhub mentions, but nothing about an hCalendar/hCard
> repository or spider. If the discussion about FINDING these
> microformats is solidified, then maybe some aggregating applications
> might start to pop up. I know i'd certainly be interested in building
> something like this!
>
> -brian
>
> [1] - http://microformats.org/wiki/hcard-brainstorming#Auto-Discovery
>
> --
> brian suda
> http://suda.co.uk
From brian.suda at gmail.com Mon Oct 10 10:46:04 2005
From: brian.suda at gmail.com (brian suda)
Date: Mon Oct 10 10:46:26 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com> <434A8ABE.6080700@gmail.com>
<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
Message-ID: <434AA8DC.5080809@gmail.com>
I have moved this to the DEV list, so any further discussion should be
done there.
Danny Ayers wrote:
>Right, that's what I had in mind. I've got an RDF store with a sweet
>little SPARQL interface, how do I get microformat data into it? The
>GRDDL approach works but requires XSLT for each microformat (I think
>in principle any kind of parser/transformer would do, just this is the
>easy/obvious approach). I made a start on one for hReview a while ago,
>but didn't find much pleasure in the activity. There are patterns in
>the microformats (as David is doing a good job of documenting), but a
>generic XSLT is probably unfeasible. However it should be possible to
>at least semi-automate per format XSLT authoring, as long as the
>microformats have machine-readable profiles. But I'm now thinking it
>might be easiest just to work from instance docs, doing a
>parser-generator kind of thing from them.
>
>
--- OK, i don't quite follow the last bit, but here's what i do/did in
my implementation. I get an HTML page and parse out the profile
attribute values. Those are URLs to the XMDPs (they don't have to be
XMDPs, that was never specified by the spec[1], it is just that XMDPs are
currently the only way to describe these things in the wild). Those
XMDPs are fetched and run through an XSLT that actually generates XSLTs
(these are subsequently cached, so once an XSLT has been built for an
XMDP, there is no need to use more bandwidth for the same thing). Then
the HTML page is tested against each XSLT (the one generated from the
original XMDP from the profile page). That simply gives a result of what
it finds on the page; no validation is done. This is because you and i
know that DTSTART is a date-time, but the machine has no way to extract
that information from the English prose in the XMDP.
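The shape of that pipeline can be sketched roughly as follows, in Python
rather than XSLT. This is not the actual implementation: the XSLT-generating
step is replaced by a stand-in callable, and the names profile_urls,
TransformCache and extract are hypothetical. The only thing taken from the
HTML 4 spec is that the profile attribute on <head> holds one or more URIs
separated by white space.

```python
from html.parser import HTMLParser


class ProfileParser(HTMLParser):
    """Pull the profile URIs off the <head> element."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            value = dict(attrs).get("profile") or ""
            # HTML 4: one or more URIs separated by white space.
            self.profiles.extend(value.split())


def profile_urls(html):
    parser = ProfileParser()
    parser.feed(html)
    parser.close()
    return parser.profiles


class TransformCache:
    """Build one transform per profile URL, then reuse it.

    build is a hypothetical callable (profile URL -> transform) standing in
    for "fetch the XMDP and generate an XSLT from it".
    """

    def __init__(self, build):
        self._build = build
        self._cache = {}

    def get(self, url):
        if url not in self._cache:
            self._cache[url] = self._build(url)
        return self._cache[url]


def extract(html, cache):
    """Apply each profile's transform to the page; nothing is validated."""
    return {url: cache.get(url)(html) for url in profile_urls(html)}
```

Calling extract twice on pages that share a profile invokes build only once
for that profile, which is the bandwidth saving described above.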
So in the long run, each XMDP will need a man-made XSLT (or other
validator). I think under the current system, there is no way to make a
universal validator (which is fine by me).
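As an illustration of what such a hand-made validator could look like, here
is a small Python sketch (instead of XSLT). Everything in it is an
assumption: the list of accepted date formats is a small, incomplete subset
of ISO 8601, and the table HCALENDAR_CHECKS is a hypothetical, hand-written
mapping of the kind that cannot be derived from the XMDP prose.

```python
from datetime import datetime

# Assumed subset of ISO 8601 layouts; a real validator would accept more.
DATETIME_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
    "%Y%m%dT%H%M%S",
    "%Y%m%d",
)


def is_datetime(value):
    """True if value parses under one of the assumed formats."""
    for fmt in DATETIME_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def is_nonempty(value):
    return bool(value.strip())


# Hand-maintained: a person has to know that DTSTART is a date-time.
HCALENDAR_CHECKS = {
    "dtstart": is_datetime,
    "dtend": is_datetime,
    "summary": is_nonempty,
}


def validate(properties, checks=HCALENDAR_CHECKS):
    """Return the names of properties that fail their check.

    Properties without an entry in checks are passed through unvalidated.
    """
    return [
        name
        for name, value in properties.items()
        if name in checks and not checks[name](value)
    ]
```

A separate table like this would be needed per microformat, which is the
point made above about there being no universal validator.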
There are things that could be added to the XMDP to make it more machine
friendly, additional information like TYPES (date-time, string, integer,
etc), but then we are re-inventing XML-Schema and we want to avoid that!
>Yep, sure.
>Incidentally, I've made a start at coming in from the other direction,
>trying to compile a list of model-format correspondences on the ESW
>Wiki:
>http://esw.w3.org/topic/MicroModels
>(Anything you could add, I'd be grateful ;-)
>
>
--- i will certainly look into this.
For others who are trying to understand the value and importance of the
profile attribute and Metamemetics, here is some recommended reading on
the topic[1][2].
-brian
[1] - http://www.w3.org/TR/REC-html40/struct/global.html#profiles
[2] - http://www.gmpg.org/xmdp/
From danny.ayers at gmail.com Tue Oct 11 05:01:01 2005
From: danny.ayers at gmail.com (Danny Ayers)
Date: Tue Oct 11 11:52:50 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <434AA8DC.5080809@gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com>
<434A8ABE.6080700@gmail.com>
<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
<434AA8DC.5080809@gmail.com>
Message-ID: <1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>
On 10/10/05, brian suda wrote:
> --- OK, i don't quite follow the last bit, but here's what i do/did in
> my implementation. I get an HTML page and parse out the profile
> attribute values. Those are URLs to the XMDPs (they don't have to be
> XMDPs, that was never specified by the spec[1], it is just that XMDPs are
> currently the only way to describe these things in the wild).
Well, there are the GRDDL profiles as well, but long term I guess
we're looking at a single profile doc covering both requirements (as
DanC has done for data-view).
> Those
> XMDPs are fetched and run through an XSLT that actually generates XSLTs
> (this has been subsequently cached, so once an XSLT has been built for a
> XMDP, there is no need to use more bandwidth for the same thing). Then
> the HTML page is tested against each XSLT (the one generated by the
> original XMDP from the profile page). That simply gives a result of what
> it finds on the page, no validation is done.
Neat.
> This is because, you and i
> know that DTSTART is a date-time, but the machine has no way to extract
> that information from the English prose in the XMDP.
Right - but didn't I see mention of using <abbr> to enable ISO dates
somewhere? That should be fairly machine-readable. (I've obviously got
more human reading to do...)
> So in the long run, each XMDP will need a man-made XSLT (or other
> validator). I think under the current system, there is no way to make a
> universal validator (which is fine by me).
Hmm, maybe. Depends on those patterns...
> There are things that could be added to the XMDP to make it more machine
> friendly, additional information like TYPES (date-time, string, integer,
> etc), but then we are re-inventing XML-Schema and we want to avoid that!
Right, I don't think it would be desirable to go down that path. I
must admit I'm not all that sure of the value of XMDP docs as
currently defined - they're not-quite human readable, not-quite
machine readable. But as they are being (/will be) defined for the
microformats then it makes sense to try and use them.
When I get a minute I'll have a look at the instance-driven generation
I mentioned (but didn't explain very well). A first step might be to
transform a sample microformat doc into an XMDP doc (with
placeholders/boilerplate for descriptions etc). Then that might be
usable to generate an approximate microformat2rdfxml.xsl.
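A rough Python sketch of that first step: collect the class names used in a
sample document and emit a profile skeleton with placeholder descriptions.
The function names are hypothetical, and the output layout (nested
definition lists under a "profile" class) is only loosely modelled on XMDP,
so it should be checked against the XMDP documentation before being relied
on.

```python
from html import escape
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Record each distinct class name in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        for name in (dict(attrs).get("class") or "").split():
            if name not in self.names:
                self.names.append(name)


def class_names(html):
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return collector.names


def profile_skeleton(html):
    """Emit an XMDP-like skeleton; descriptions are left as TODO placeholders."""
    items = "\n".join(
        '   <dt id="{0}">{0}</dt>\n   <dd>TODO: describe {0}</dd>'.format(escape(n))
        for n in class_names(html)
    )
    return (
        '<dl class="profile">\n'
        ' <dt id="class">class</dt>\n'
        ' <dd>\n'
        '  <dl>\n' + items + '\n'
        '  </dl>\n'
        ' </dd>\n'
        '</dl>'
    )
```

Note that this picks up every class name in the sample, including purely
presentational ones, so the result would still need pruning by hand.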
Cheers,
Danny.
--
http://dannyayers.com
From bud at thecommunityengine.com Tue Oct 11 13:37:48 2005
From: bud at thecommunityengine.com (Bud Gibson)
Date: Tue Oct 11 13:37:43 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com>
<434A8ABE.6080700@gmail.com>
<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
<434AA8DC.5080809@gmail.com>
<1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>
Message-ID: <69B52E96-EDC7-4772-9A1F-FB53410C2885@thecommunityengine.com>
On Oct 11, 2005, at 8:01, Danny Ayers wrote:
> Right, I don't think it would be desirable to go down that path. I
> must admit I'm not all that sure of the value of XMDP docs as
> currently defined - they're not-quite human readable, not-quite
> machine readable. But as they are being (/will be) defined for the
> microformats then it makes sense to try and use them.
>
If done right, I think XMDPs are adequate for summary human reading
(sort of an "in a nutshell" version). They are not a machine
format. They just do not give you enough.
I pretty much agree with the idea of writing separate XSLT rules for
microformat identification and validation. Generally, I think there
is room to extend the original idea of microformats (really intended
as a way to add moderate structure to pages for more search engine
friendliness) to something that is more amenable to processing as a
data structure. This could be achieved by writing automated
validators for each microformat.
We did something like that for xFolk with Greasemonkey here:
http://thecommunityengine.com/resources/xfolk-forms5.user.js
There is a similar and possibly better structured effort in Java here:
http://www.pokkari.com/blog/2005/10/05/xfolk-for-java/
Another person, Andreas Haugstrup, is in the process of doing this in
PHP.
Bud