Current development on JAMWiki is primarily focused on maintenance rather than new features due to a lack of developer availability. If you are interested in working on JAMWiki please join the jamwiki-devel mailing list.

Tech comments:Parser isolation

Keep things in branch

If this can be done it would be a huge benefit, and I'd like to help out in any way I can. However, please don't commit any code to trunk for this one yet unless it is completely isolated from existing Wiki code. The 0.6.0 release has already dragged on for a long time, and ripping out the parser at this point would push the release out even further. I suspect that this is obvious, especially since you've already created a separate branch, but I wanted to make sure. -- Ryan 26-Jul-2007 11:55 PDT

Thanks for the clarification; I understood it the same way. The parser isolation work is versioned as 0.6.1-SNAPSHOT.
Perfect, thanks! I just wanted to make sure there wasn't any confusion about when this change could be included in a release, but I see we were thinking the same things. -- Ryan 27-Jul-2007 09:44 PDT

Parser modifications

Moved from the Feedback page:

I played a bit with your JAMWiki parser (see Image:wikiparser.tar.gz). It might match some of the ideas behind "Non-embedded parser" in your TODO list.

Important thing: My goal was to make the parser small(!) so:

  • I deleted as much as possible, including the caching code (ehcache) and the "allow-javascript" functionality
  • all necessary library functions from the Spring Framework JAR file are extracted into my package org.jamwiki.utils

The usage might be something like:

   import java.util.Locale;
   import org.jamwiki.parser.ParserDocument;
   import org.jamwiki.parser.ParserInput;
   import org.jamwiki.parser.jflex.JFlexParser;

   public class ParserExample {
       public static void main(String[] args) {
           String text = ">>> the text to parse <<<";
           ParserInput pi = new ParserInput();
           pi.setLocale(new Locale("de"));
           ParserDocument pd = new ParserDocument();
           JFlexParser p = new JFlexParser(pi);
           try {
               pd = p.parseHTML(text);
           } catch (Exception e) {
               // report the failure instead of silently swallowing it
               e.printStackTrace();
           }
           String output = pd.getContent();
           System.out.println(output);
       }
   }

So, is it useful for you? Do you have any comments? (my work is still not finished)

Bost

PS: From time to time I write code for an open source search engine, www.yacy.net. We have a small wiki system and we're looking for a better, MediaWiki-compatible wiki than the current one. That's why I played with your code...

The above message was sent to me via email, but I'm copying it here so that the discussion is more open. For my part I haven't had time to look at these changes, but several people have expressed interest in de-coupling the parser from the rest of the Wiki code so it may have some general interest.
As a side note, it would be best not to email me directly but instead to have discussions on jamwiki.org. If anyone knows how I can disable the email link from Sourceforge and thus hopefully force discussions to jamwiki.org I'd be grateful - I searched for a while but didn't see any options for disabling email links. -- Ryan 26-Mar-2007 21:37 PST

Parser integration

Copied from Tech:Parser integration:

To use the parser in a more flexible way, I suggest modifying the way the data handler is used: keep the DataHandler interface, but pass an instance to the ParserInput. With this change I was able to use the parser in our project with different settings, dynamic URLs and even dynamic image URLs. After the modification the WikiBase singleton was no longer needed in the various places in the code, because the code uses the data handler instance at runtime.

  • with WikiBase: WikiBase.dataHandler.exists(...)
  • without WikiBase: parserInput.getDataHandler().exists(parserInput.getVirtualWiki(), topic)

-- 130.60.112.109 28-Jul-2007 11:26 PDT
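
The suggestion above could look roughly like the following. This is only a minimal sketch: SketchDataHandler, SketchParserInput and LinkRenderer are hypothetical, simplified stand-ins and not the real JAMWiki classes; only the call shape getDataHandler().exists(getVirtualWiki(), topic) is taken from the comment above.

```java
import java.util.Set;

// Demo: each parser input carries its own handler, so no global singleton is needed.
class DataHandlerDemo {
    public static void main(String[] args) {
        Set<String> topics = Set.of("StartingPoints");
        SketchDataHandler inMemory = (virtualWiki, topic) -> topics.contains(topic);
        SketchParserInput input = new SketchParserInput("en", inMemory);
        System.out.println(LinkRenderer.render(input, "StartingPoints"));
        System.out.println(LinkRenderer.render(input, "NoSuchTopic"));
    }
}

// Hypothetical stand-in for the DataHandler interface (only the lookup used here).
interface SketchDataHandler {
    boolean exists(String virtualWiki, String topic);
}

// Hypothetical stand-in for ParserInput that holds a data handler instance.
final class SketchParserInput {
    private final String virtualWiki;
    private final SketchDataHandler dataHandler;

    SketchParserInput(String virtualWiki, SketchDataHandler dataHandler) {
        this.virtualWiki = virtualWiki;
        this.dataHandler = dataHandler;
    }

    String getVirtualWiki() { return virtualWiki; }
    SketchDataHandler getDataHandler() { return dataHandler; }
}

// Hypothetical parser-side code: it consults only the handler held by the input.
final class LinkRenderer {
    static String render(SketchParserInput input, String topic) {
        boolean exists = input.getDataHandler().exists(input.getVirtualWiki(), topic);
        String cssClass = exists ? "existing" : "missing";
        return "<a class=\"" + cssClass + "\" href=\"/" + input.getVirtualWiki()
                + "/" + topic + "\">" + topic + "</a>";
    }
}
```

With this shape, two parser inputs in the same VM can use different handlers (for example, one per user or per URL scheme), which is the flexibility the comment asks for.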

Better interface for parser configuration

Copied from the Feedback page:

Hi, thanks for the great project! I was trying to use the parser with the HTML rendering engine in another project, but I failed because the configuration is done singleton-style by calling WikiBase.getDataHandler(). This works nicely if you use just one parser per VM, but in my case we have completely dynamic URLs, since we use a component framework, and even image URLs are per-user for security reasons. It would be nice to have either a way to configure each parser instance by setting these things on the ParserInput object, or a hook for image and dynamic URLs.

At the very least, I think this would be very important for the bliki project, which should be usable from external frameworks as well.

Thanks in advance for an answer, and keep up the good work!

guido

If there is a way to make the parser more flexible then we should definitely do it. I'm not sure I fully understand exactly what you're proposing regarding modifying how ParserInput is created - there needs to be some way to query existing data, so if WikiBase.getDataHandler() isn't used then some other mechanism is needed - but if you're willing, please start a Tech:Parser integration topic that discusses your specific ideas, and hopefully gives some examples of the interface you'd like to see. Thanks for the feedback! -- Ryan 15-Jan-2007 20:25 PST


Parser API or complete JamWiki API

Actually, I'm not so sure it is useful to isolate the parser. What people really want is to gain access to the JamWiki API in order to develop extensions/plugins.

At this point, I think we need a separate API, but for the whole Jamwiki engine, not only the parser.

I think there would be value in defining each component of the wiki and its interaction with other components, and there has been an attempt to start documenting this on Tech:JAMWiki Design. The original code base was not always very modular, and while it's cleaner today than it was originally there are also some new inter-dependencies that should probably be cleaned up.
With regards to the parser isolation work, I see that as a first step towards cleaning up the parser interface. Once that first step is complete we can then look at where the dependencies are, and what the best interface(s) are for cleaning up and eliminating those dependencies. The same approach would hold true for the rest of the JAMWiki code - the more modular the code base becomes, the easier it is to create clean interfaces that allow for more flexible development.
Hopefully that addresses your point - anywhere where the interfaces/API can be cleaned up is good, and I'd encourage discussion of how best to do so. I don't want to do a complete re-design, but rather incrementally move the code towards cleaner interfaces and more modularization. -- Ryan 28-Jul-2007 13:02 PDT
One more item - if by "parser isolation" you mean having the parser be a separate Maven project then I agree that might be going too far, but in your description you've stated that the goal is to "isolate the parser from the JAMWiki core". That is definitely a worthwhile goal - if (for example) the parser was simply its own JAR file that was included in the JAMWiki distribution, and no other code had any dependencies on that JAR, I think it would be a very good thing. I don't think that the core parser interfaces should necessarily be split from the rest of the project, but that might be something that has some value in the future. -- Ryan 28-Jul-2007 13:10 PDT

Tangle

The org.jamwiki.parser package contains the interfaces and objects necessary to implement ANY parser for JAMWiki. The two currently implemented parsers are:

  • org.jamwiki.parser.jflex
  • org.jamwiki.parser.bliki

In my view the steps needed to achieve "parser isolation" would be to first make the org.jamwiki.parser.jflex its own project which could be distributed as a separate JAR. Initially this would offer little or no advantage aside from splitting the code out. The second step would be to begin simplifying and eliminating dependencies of this parser project from the rest of the JAMWiki code base. I'm not sure that this could ever be done completely - templates will always need some way to look up their code, for example - but I do think it might be possible to limit those dependencies by modifying the org.jamwiki.parser code to pass as much as possible along to the actual parser implementations.

Thanks for your guidance. I made progress toward that goal yesterday:
  • org.jamwiki.parser.jflex is its own project (jamwiki-parser-jflex)
  • I was not able to do the same thing for bliki: org.jamwiki.parser.bliki depends on org.jamwiki.parser.jflex. I thought there were two independent implementations (of course, I could make bliki depend on parser-jflex, but that does not make sense). I'm probably missing something here.
  • To do that, I created jamwiki-core, which holds all the code except the servlets, which remain in jamwiki. The simple reason is that a project cannot depend on a WAR; it can only depend on a JAR.
  • As you say, right now this offers no advantage aside from splitting the code out. But it's already cleaner like that: jamwiki contains only servlets and a dependency on jamwiki-core
  • Now I'd like to create a jamwiki-parser-api project that contains the full API of the parser. Each parser should depend on and implement this jamwiki-parser-api. Also, jamwiki-core will depend on jamwiki-parser-api. To do this, a couple of interfaces will probably need to be extracted from the classes of jamwiki-core.
  • Then my first goal will be achieved: anyone will be able to implement a parser based on a small API.
  • The end goal: JAMWiki is modularized, and it is easy to add or remove a module
Régis Décamps 01-Aug-2007 05:32 PDT
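
To make the jamwiki-parser-api idea above more concrete, here is a rough sketch of how the split could look. All names in it (WikiParser, BoldOnlyParser, CoreRenderer) are hypothetical and are not taken from the JAMWiki code base; the toy parser only handles bold markup and is there just to show the direction of the dependencies.

```java
// Demo wiring: the core picks an implementation but only talks to the interface.
class ParserApiDemo {
    public static void main(String[] args) {
        CoreRenderer core = new CoreRenderer(new BoldOnlyParser());
        System.out.println(core.renderTopic("'''bold''' text"));
    }
}

// Would live in the API module: the only type a parser implementation
// and the core both need to compile against (hypothetical interface).
interface WikiParser {
    String parseHtml(String wikiText);
}

// Would live in a separate parser module that depends only on the API module.
// Toy implementation: converts '''bold''' wiki markup to <b> tags.
final class BoldOnlyParser implements WikiParser {
    @Override
    public String parseHtml(String wikiText) {
        return wikiText.replaceAll("'''(.+?)'''", "<b>$1</b>");
    }
}

// Would live in the core module: it depends on the interface, never on a
// concrete parser class, so parsers can be added or removed freely.
final class CoreRenderer {
    private final WikiParser parser;

    CoreRenderer(WikiParser parser) {
        this.parser = parser;
    }

    String renderTopic(String rawText) {
        return "<div class=\"topic\">" + parser.parseHtml(rawText) + "</div>";
    }
}
```

The design choice being illustrated is that both the core and each parser depend on the small API module, and neither depends on the other directly.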


I may not be fully understanding the end goal, so let me know if I'm on the wrong track or not making sense - I hit my head a lot as a kid, so I get confused frequently :-P -- Ryan 30-Jul-2007 21:15 PDT

I'm glad you are confused too, I had some doubts myself a couple of days ago (see above conversation)
This all sounds good. I unfortunately haven't had a lot of time available to look at your changes (I look at everything that lands on trunk, but I'm not as good about looking at branches) but the idea sounds right, and based on the code you've committed to trunk it seems like you've got a good grasp of what you're doing. Let me know if there's anything in particular you want me to look at, otherwise I'll trust your good judgment. -- Ryan 01-Aug-2007 08:53 PDT