From brian.suda at gmail.com Mon Oct 3 18:19:17 2005
From: brian.suda at gmail.com (Brian Suda)
Date: Mon Oct 3 18:19:19 2005
Subject: [microformats-dev] hCa* follow links
Message-ID: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>

On the discuss list someone had a question about what they should do
regarding encoding events (partial listings, teasers) on their
homepage which lead to full records that were marked up with
hCalendar.

That got me thinking about an hCal/hCard follow attribute value. I
know my REL example is probably a bad idea, but for example:
<a href="..." rel="vcard">contact</a>

When an 'a' element has a rel of 'vcard' or 'vevent', it could signal
the transforming app to fetch the linked page too and include it in
the transformation.

Is this a bad idea? Would this little transforming app turn into a
runaway spider? Or is this any help at all? A spider will fetch every
page anyway, so telling it that there is an hCa* on the page won't
even save any processor cycles. Just because there isn't an explicit
mention of encoding doesn't mean there isn't one (so you check
anyway), and just because someone says there is an hCa* at the end of
a link doesn't mean there is (so you check anyway)...

Which kinda segues into the old Auto-Discovery idea [1]. It hasn't
been looked at in a while, so I'll bring it back to the forefront to
see if we can get some comments and thoughts.

The discuss list has had rumblings of application support for
microformats: loads of Greasemonkey script links were passed around,
along with X2V and rubhub mentions, but nothing about an
hCalendar/hCard repository or spider. If the discussion about FINDING
these microformats is solidified, then maybe some aggregating
applications might start to pop up. I know I'd certainly be interested
in building something like this!
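
[Editor's sketch] The follow-link idea above could be prototyped roughly
as below, with the runaway-spider worry handled by following hinted
links exactly one hop. This is a minimal sketch, not anything the list
agreed on: the names FOLLOW_RELS, follow_links and pages_to_transform
are hypothetical, and `fetch` is an assumed caller-supplied function
that returns the HTML for a URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# rel values proposed in the message; treating them as "follow" hints
# is an assumption of this sketch, not an agreed convention.
FOLLOW_RELS = {"vcard", "vevent"}


class FollowLinkCollector(HTMLParser):
    """Collect hrefs of <a> elements whose rel includes vcard or vevent."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated list, so split before comparing.
        rels = set((attr.get("rel") or "").lower().split())
        href = attr.get("href")
        if href and rels & FOLLOW_RELS:
            self.links.append(urljoin(self.base_url, href))


def follow_links(html, base_url):
    """Return absolute URLs of the hinted links found in html."""
    collector = FollowLinkCollector(base_url)
    collector.feed(html)
    return collector.links


def pages_to_transform(start_url, fetch):
    """Return the start page plus hinted pages, following one hop only.

    Hinted pages are fetched but never scanned for further hints, which
    keeps the transformer from turning into a spider.
    """
    start_html = fetch(start_url)
    pages = {start_url: start_html}
    for url in follow_links(start_html, start_url):
        if url not in pages:
            pages[url] = fetch(url)
    return pages
```

As the message notes, the hint guarantees nothing: each fetched page
still has to be checked for an actual hCard or hCalendar.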

-brian

[1] - http://microformats.org/wiki/hcard-brainstorming#Auto-Discovery

--
brian suda
http://suda.co.uk

From bud at thecommunityengine.com Tue Oct 4 13:37:00 2005
From: bud at thecommunityengine.com (Bud Gibson)
Date: Tue Oct 4 13:37:00 2005
Subject: [microformats-dev] hCa* follow links
In-Reply-To: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>
References: <21e770780510031819y318d3f18maefa8139a5d50efc@mail.gmail.com>
Message-ID: <61911836-747E-43BD-B746-E07E4926B7D5@thecommunityengine.com>

What's really happening here is that we are starting to treat HTML
pages as data. I see no problem giving a hint as to what is at the
other end of a link.

One small thought on autodiscovery. Microformats don't have the
discovery framework of, say, web services. However, microformats do
identify themselves via their enclosing elements. Could one fake
autodiscovery by declaring web services that take data structures of
type whatever-the-microformat-enclosing-element-is and then providing
services based on a well-defined interface? The trick would be to have
a microformat registry or federation of registries that would identify
these enclosing elements.

I started a bit down this path with a Greasemonkey script I defined in
August.

Bud

On Oct 3, 2005, at 21:19, Brian Suda wrote:

> On the discuss list someone had a question about what they should do
> regarding encoding events (partial listings, teasers) on their
> homepage which lead to full records that were marked-up with
> hCalendar.
>
> That got me to thinking about an hCal/hCard follow attribute value. I
> know my REL example is probably a bad idea, but for example:
> <a href="..." rel="vcard">contact</a>
>
> When an 'a' element has 'vcard' or 'vevent' it could signify to the
> transforming app to fetch this page too and consider it in the
> transformation.
>
> Is this a bad idea? would this little transforming app turn into a
> run-away spider? or is this any help?
> because a spider will fetch every page anyway, telling you that
> there is an hCa* on the page won't even save any processor cycles,
> just because there isn't an explicit mention of encoding doesn't
> mean there isn't (so you check for them anyway), and just because
> someone says there is an hCa* at the end of this link doesn't mean
> there is (so you check anyway)...
>
> which kinda segues into the old Auto-Discovery idea.[1] It hasn't
> been looked at in a while, so i'll bring it back to the forefront to
> see if we can get some comments and thoughts.
>
> The discuss list has had the rumblings of application support for
> microformats, loads of Greasemonkey script links were passed around,
> X2V, and rubhub mentions, but nothing about an hCalendar/hCard
> repository or spider. If the discussion about FINDING these
> microformats is solidified, then maybe some aggregating applications
> might start to pop up. I know i'd certainly be interested in building
> something like this!
>
> -brian
>
> [1] - http://microformats.org/wiki/hcard-brainstorming#Auto-Discovery
>
> --
> brian suda
> http://suda.co.uk

From brian.suda at gmail.com Mon Oct 10 10:46:04 2005
From: brian.suda at gmail.com (brian suda)
Date: Mon Oct 10 10:46:26 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com>
	<434A8ABE.6080700@gmail.com>
	<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
Message-ID: <434AA8DC.5080809@gmail.com>

I have moved this to the DEV list, so any further discussion should
take place there.

Danny Ayers wrote:

>Right, that's what I had in mind.
>I've got an RDF store with a sweet
>little SPARQL interface, how do I get microformat data into it? The
>GRDDL approach works but requires XSLT for each microformat (I think
>in principle any kind of parser/transformer would do, just this is the
>easy/obvious approach). I made a start on one for hReview a while ago,
>but didn't find much pleasure in the activity. There are patterns in
>the microformats (as David is doing a good job of documenting), but a
>generic XSLT is probably unfeasible. However it should be possible to
>at least semi-automate per format XSLT authoring, as long as the
>microformats have machine-readable profiles. But I'm now thinking it
>might be easiest just to work from instance docs, doing a
>parser-generator kind of thing from them.
>
>

--- OK, I don't quite follow the last bit, but here's what I do/did in
my implementation. I get an HTML page and parse out the profile
attribute values. Those are URLs to the XMDPs (they don't have to be
XMDPs, that was never specified by the spec[1], it is just that XMDPs
are currently the only way to describe these things in the wild).
Those XMDPs are fetched and run through an XSLT that actually
generates XSLTs (these are cached, so once an XSLT has been built for
an XMDP, there is no need to use more bandwidth for the same thing).
Then the HTML page is tested against each XSLT (the one generated from
the original XMDP named in the page's profile). That simply gives a
result of what it finds on the page; no validation is done. This is
because you and I know that DTSTART is a date-time, but the machine
has no way to extract that information from the English prose in the
XMDP.

So in the long run, each XMDP will need a man-made XSLT (or other
validator). I think under the current system, there is no way to make
a universal validator (which is fine by me).
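
[Editor's sketch] The first and third steps of the pipeline described
above (reading the profile URLs off the page, and building each
per-profile stylesheet only once) might look roughly like this. It is
a minimal sketch under stated assumptions, not the implementation the
message refers to: profile_urls and StylesheetCache are hypothetical
names, and `build` stands in for the fetch-XMDP-and-generate-XSLT
step, which is not shown. In HTML 4 the head element's profile
attribute holds a whitespace-separated list of URIs, which is what the
split below relies on.

```python
from html.parser import HTMLParser


class ProfileExtractor(HTMLParser):
    """Collect the URIs listed in the <head profile="..."> attribute."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            value = dict(attrs).get("profile") or ""
            # The attribute is a whitespace-separated list of URIs.
            self.profiles.extend(value.split())


def profile_urls(html):
    """Return the profile URIs declared by an HTML page, in order."""
    extractor = ProfileExtractor()
    extractor.feed(html)
    return extractor.profiles


class StylesheetCache:
    """Build one stylesheet per profile URL and reuse it afterwards.

    `build` is an assumed caller-supplied function mapping a profile
    URL to a generated stylesheet (for example, by fetching the XMDP
    and running it through the generator XSLT).
    """

    def __init__(self, build):
        self._build = build
        self._cache = {}

    def get(self, profile_url):
        if profile_url not in self._cache:
            self._cache[profile_url] = self._build(profile_url)
        return self._cache[profile_url]
```

A caller would then loop over profile_urls(page), ask the cache for
each stylesheet, and apply it to the page with whatever XSLT processor
is at hand.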

There are things that could be added to the XMDP to make it more
machine friendly, additional information like TYPES (date-time,
string, integer, etc), but then we are re-inventing XML-Schema and we
want to avoid that!

>Yep, sure.
>Incidentally, I've made a start at coming in from the other direction,
>trying to compile a list of model-format correspondences on the ESW
>Wiki:
>http://esw.w3.org/topic/MicroModels
>(Anything you could add, I'd be grateful ;-)
>
>

--- I will certainly look into this.

For others who are trying to understand the value and importance of
the profile attribute and Metamemetics, here is some recommended
reading on the topic[1][2].

-brian

[1] - http://www.w3.org/TR/REC-html40/struct/global.html#profiles
[2] - http://www.gmpg.org/xmdp/

From danny.ayers at gmail.com Tue Oct 11 05:01:01 2005
From: danny.ayers at gmail.com (Danny Ayers)
Date: Tue Oct 11 11:52:50 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <434AA8DC.5080809@gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com>
	<434A8ABE.6080700@gmail.com>
	<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
	<434AA8DC.5080809@gmail.com>
Message-ID: <1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>

On 10/10/05, brian suda wrote:

> --- OK, i don't quite follow the last bit, but here's what i do/did in
> my implementation. I get an HTML page and parse out the profile
> attribute values. Those are URLs to the XMDPs (they don't have to be
> XMDPs, that was never specified by the spec[1], it is just that XMDPs
> are currently the only way to describe these things in the wild).

Well, there are the GRDDL profiles as well, but long term I guess
we're looking at a single profile doc covering both requirements (as
DanC has done for data-view).

> Those
> XMDPs are fetched and run through an XSLT that actually generated XSLTs
> (this has been subsequently cached, so once an XSLT has been built for a
> XMDP, there is no need to use more bandwidth for the same thing). Then
> the HTML page is tested against each XSLT (the one generated by the
> original XMDP from the profile page). That simply gives a result of what
> it finds on the page, no validation is done.

Neat.

> This is because, you and i
> know that DTSTART is a date-time, but the machine has no way to extract
> that information from the english-prose in the XMDP.

Right - but didn't I see mention of using to enable ISO dates
somewhere? That should be fairly machine-readable. (I've obviously got
more human reading to do...)

> So in the long run, each XMDP will need a man-made XSLT (or other
> validator). I think under the current system, there is no way to make a
> universal validator (which is fine by me).

Hmm, maybe. Depends on those patterns...

> There are things that could be added to the XMDP to make it more machine
> friendly, additional information like TYPES (date-time, string, integer,
> etc), but then we are re-inventing XML-Schema and we want to avoid that!

Right, I don't think it would be desirable to go down that path. I
must admit I'm not all that sure of the value of XMDP docs as
currently defined - they're not-quite human readable, not-quite
machine readable. But as they are being (/will be) defined for the
microformats, it makes sense to try and use them.

When I get a minute I'll have a look at the instance-driven generation
I mentioned (but didn't explain very well). A first step might be to
transform a sample microformat doc into an XMDP doc (with
placeholders/boilerplate for descriptions etc). Then that might be
usable to generate an approximate microformat2rdfxml.xsl.

Cheers,
Danny.
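
[Editor's sketch] The "first step" described above, going from a
sample microformat document to a profile skeleton with placeholder
descriptions, could be roughed out as follows. This is only a sketch
under stated assumptions: xmdp_skeleton is a hypothetical name, every
class name in the sample is treated as a candidate term, and the flat
definition list is merely an approximation of XMDP's layout (real
XMDP profiles group terms under the attribute they apply to).

```python
from html import escape
from html.parser import HTMLParser


class ClassNameCollector(HTMLParser):
    """Collect distinct class names in order of first appearance."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # class is a space-separated list of names.
        for name in (dict(attrs).get("class") or "").split():
            if name not in self.names:
                self.names.append(name)


def xmdp_skeleton(sample_html):
    """Return an approximate XMDP-style skeleton for a sample document.

    Each class name becomes a term with a TODO placeholder where the
    human-written description would go.
    """
    collector = ClassNameCollector()
    collector.feed(sample_html)
    lines = ['<dl class="profile">']
    for name in collector.names:
        safe = escape(name, quote=True)
        lines.append(f' <dt id="{safe}">{safe}</dt>')
        lines.append(f' <dd>TODO: describe {safe}</dd>')
    lines.append('</dl>')
    return "\n".join(lines)
```

A person would still have to prune presentational class names and fill
in the descriptions by hand before the result is usable as a profile.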

--
http://dannyayers.com

From bud at thecommunityengine.com Tue Oct 11 13:37:48 2005
From: bud at thecommunityengine.com (Bud Gibson)
Date: Tue Oct 11 13:37:43 2005
Subject: [microformats-dev] Re: [microformats-discuss] Profiles status
In-Reply-To: <1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>
References: <1f2ed5cd0510100812j58ed258dld8d77d94b5cabe7c@mail.gmail.com>
	<434A8ABE.6080700@gmail.com>
	<1f2ed5cd0510101000q440f00fdn6d8fd98781a27e72@mail.gmail.com>
	<434AA8DC.5080809@gmail.com>
	<1f2ed5cd0510110501m4a5044cctfc44c76cc3609cc@mail.gmail.com>
Message-ID: <69B52E96-EDC7-4772-9A1F-FB53410C2885@thecommunityengine.com>

On Oct 11, 2005, at 8:01, Danny Ayers wrote:

> Right, I don't think it would be desirable to go down that path. I
> must admit I'm not all that sure of the value of XMDP docs as
> currently defined - they're not-quite human readable, not-quite
> machine readable. But as they are being (/will be) defined for the
> microformats then it makes sense to try and use them.
>

If done right, I think XMDPs are adequate for summary human reading
(sort of an "in a nutshell" version). They are not a machine format.
They just do not give you enough. I pretty much agree with the idea of
writing separate XSLT rules for microformat identification and
validation.

Generally, I think there is room to extend the original idea of
microformats (really intended as a way to add moderate structure to
pages for more search engine friendliness) to something that is more
amenable to processing as a data structure. This could be achieved by
writing automated validators for each microformat. We did something
like that for xFolk with Greasemonkey here:

http://thecommunityengine.com/resources/xfolk-forms5.user.js

There is a similar and possibly better structured effort in Java here:

http://www.pokkari.com/blog/2005/10/05/xfolk-for-java/

Another person, Andreas Haugstrup, is in the process of doing this in
PHP.

Bud