Able2Extract: converting PDF back to CAD drawings

Disclaimer: I was politely asked by the developer of this software to feature a review on my blog. They offered a free license code so I could test the software properly, but provided no other compensation and did not interfere with the actual review. They were available for questions, but the review is my personal opinion.

What is Able2Extract?

The software allows you to convert PDF documents into editable digital documents in various formats. It focuses on Microsoft Office formats (Word, PowerPoint and Excel), but also caters to CAD users by allowing conversion into the AutoCAD DWG or DXF format.
There is a Standard and a Pro version. The major difference is the addition of Optical Character Recognition (OCR) in the Pro version, which allows conversion of graphics into characters or text.
http://www.investintech.com/able2extract.html



The program is fairly easy to use. You load a PDF, define pages or selection areas to extract and then choose an output format. The interface is simple and straightforward and the number of options is limited. For a utility program, this is not a bad thing.

The software is not free, but prices are fairly low and make sense: about $100 for a license and $35 for a one-month subscription. Unusually, the software runs on Windows, OSX and Linux, which can only be applauded. You can try it for free if you want to test it for yourself. There are cheaper versions, but they lack the AutoCAD format support, which would be one of the main reasons to get this software in an architecture context.

What can you expect?

You have to understand that most PDF drawings of architecture and other designs originate from CAD systems, so for such documents, this software is useful. Only when you have old paper-based documents that were simply scanned might you not get anything out of it.

I've discussed it a bit with somebody from Investintech, the developer of the software.
  • The software converts text to real text, lines into polylines and filled shapes to hatches.
  • Arcs and circles will be approximated with line segments, as PDF has no equivalent (only lines and Bézier splines). They did suggest the possibility of further adding such functionality, if requested by users.
  • For full text, it might be better to use the conversion to Word function.
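To get a feel for what the arc approximation mentioned above means in practice, here is a minimal Python sketch that turns an arc into a chain of straight segments. This is not how Able2Extract does it internally (I have no insight into that); the function name `arc_to_polyline` and the `max_sagitta` tolerance are my own assumptions, purely for illustration.

```python
import math

def arc_to_polyline(cx, cy, radius, start_deg, end_deg, max_sagitta=0.01):
    """Approximate a counter-clockwise arc with straight segments.

    max_sagitta is the largest allowed gap between a chord and the
    true arc, in drawing units (a hypothetical tolerance parameter).
    """
    if radius <= 0 or max_sagitta <= 0:
        raise ValueError("radius and max_sagitta must be positive")
    # Sweep angle in radians; identical start and end means a full circle.
    sweep = math.radians((end_deg - start_deg) % 360) or 2 * math.pi
    # For a chord spanning angle a, the gap to the arc is r * (1 - cos(a / 2)).
    # Solve for the largest angle that keeps the gap within max_sagitta.
    ratio = max(-1.0, 1 - max_sagitta / radius)
    max_step = 2 * math.acos(ratio)
    steps = max(1, math.ceil(sweep / max_step))
    start = math.radians(start_deg)
    return [
        (cx + radius * math.cos(start + sweep * i / steps),
         cy + radius * math.sin(start + sweep * i / steps))
        for i in range(steps + 1)
    ]
```

A tighter tolerance gives more and shorter segments: the drawing looks the same on screen, but the result is a polyline, not an arc entity with a centre and a radius.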
Layers can be used inside PDF, but not all CAD software actually creates them when exporting. Able2Extract has the following approach:
  • There will be a default layer 0 (zero), as any AutoCAD DWG requires it;
  • B layer contains the page background (we did not have this in our DXF examples);
  • H layer contains all filled shapes (hatches);
  • P layer contains linework, as polylines;
  • T layer has text (if supported).
By nature, a PDF does not record exactly the same drawing operations as your CAD system. Still, the software supports some interpretation, such as connecting consecutive line segments into one.
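The idea of connecting consecutive segments can be sketched in a few lines of Python. This is only my own illustration of the principle, assuming segments arrive in drawing order as pairs of points; the function name `merge_segments` and the tolerance `tol` are hypothetical, and the actual logic in Able2Extract may well differ.

```python
def merge_segments(segments, tol=1e-6):
    """Chain segments whose start point meets the previous end point.

    segments: list of ((x1, y1), (x2, y2)) tuples in drawing order.
    Returns a list of polylines, each a list of (x, y) points.
    """
    def same(p, q):
        return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

    polylines = []
    for start, end in segments:
        if polylines and same(polylines[-1][-1], start):
            polylines[-1].append(end)       # continues the current polyline
        else:
            polylines.append([start, end])  # starts a new polyline
    return polylines
```

Note that this only joins segments that follow each other directly. Segments that touch but were written to the PDF in a different order would stay separate in this sketch.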

And how does it perform?

We loaded a PDF of a fairly typical site plan. This is a multi-page PDF with text and some drawings originating from CAD software (quite often AutoCAD, especially with land surveyors).


You can select any page from the document, zoom in or out, rotate the page (although this does not affect export) and use the menu to choose between different selection modes: all from document, page range or one particular page. From the toolbar, you can also define an area to select.

We get a warning that text will not be converted (as we are using the Standard Edition of the software). The rest is supported and converted as vector geometry, so the "image" term might be a bit confusing.

Once you have made your selection, you can choose the conversion type. We opt for "AutoCAD", although you don't need the AutoCAD software, as long as you can read DWG or DXF.


As mentioned by the developer, we could only check DXF and not DWG. This is a Mac limitation of release A2E v8.0.43; DWG will be supported with the next update.

We also enabled the "merge connected segments into polyline" option, as it promises to make the result more usable.

We then checked the result in AutoCAD 2014.


The first result seemed a bit disappointing, but that is due to the nature of the conversion process: the drawing used solid fills, which have all been placed on a layer "H" with white as layer colour. The linework arrived on layer "P", which was also set to white.

By changing the "H" layer to grey and the "P" to red, the result looked better. Selecting some lines from the hatch revealed them as "polylines".
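If you want to verify what ended up on which layer without opening a CAD program, an ASCII DXF is easy to inspect: it is a plain list of group code and value pairs, where code 0 gives the entity type and code 8 the layer name. Below is a minimal Python sketch under that assumption; the function name `entities_per_layer` is mine, and the sketch ignores binary DXF and anything outside the ENTITIES section.

```python
from collections import Counter

def entities_per_layer(dxf_text):
    """Count (layer, entity type) pairs in the ENTITIES section of an ASCII DXF."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    # An ASCII DXF alternates between a group code line and a value line.
    pairs = list(zip(lines[0::2], lines[1::2]))
    counts = Counter()
    in_entities = False
    current_type = None
    for i, (code, value) in enumerate(pairs):
        if code == "0" and value == "SECTION":
            # The section name follows in the next pair (group code 2).
            name = pairs[i + 1][1] if i + 1 < len(pairs) else ""
            in_entities = name == "ENTITIES"
            current_type = None
        elif code == "0" and value == "ENDSEC":
            in_entities = False
        elif in_entities and code == "0":
            current_type = value            # a new entity starts here
        elif in_entities and code == "8" and current_type:
            counts[(value, current_type)] += 1
            current_type = None             # count each entity only once
    return counts
```

You would feed it the contents of the converted file, for example `entities_per_layer(open("site_plan.dxf").read())`, where the file name is just a placeholder. Old-style polylines store their vertices as separate sub-entities, so those would show up in the count as well.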


While we would have loved to see all lines from a single hatch grouped together, we can't really blame the software for that. The translation from CAD files to PDF documents has certain limitations (PDF has no equivalent of an AutoCAD associative hatch).

So we have an editable, scalable and measurable drawing. Not the way you would draw it in CAD, but usable nonetheless.

For completeness, we also loaded the same DXF in Dassault Draftsight, which is free to use and also has an OSX version, so you don't really need an AutoCAD license.


The result was exactly the same. The screenshot was taken after the layer H was turned grey and P was turned red.

To test some other features, I started from an ArchiCAD drawing printed into a layout (as that is a typical format you might receive from an architectural office, if they don't share CAD or BIM documents, that is).

Alas, here we encountered a file that made the software freeze. I sent the file to the developer, who traced the freeze to an existing bug that will be fixed in the upcoming update (v8.0.44). Even so, AutoCAD could read the file as an underlay, retaining it as PDF. For some time now, underlays in AutoCAD have supported entity snapping, so you might not always need the conversion to DWG/DXF.

Here is the result with the default Color Table option, after switching the Layer colours to a more legible colour.


And with the RGB colour mode we get a more faithful colour palette:


As mentioned before, don't expect real "Arcs" or "Hatches". You'll get linework that looks the same, but does not have the same structure. The same goes for "Linetypes": a dashed line will become individual small line segments.

Pictures that were placed in the file (actually generated from the ArchiCAD 3D Window) were lost and replaced with solid fill hatches. The Vectorial 3D drawing from the ArchiCAD Perspective Window was recreated correctly with solid fills of the right colour (in the RGB colour mode conversion).

Colour from solid fills was not retained when using the Color Table/AutoCAD 2000 option, as all hatches end up on the same layer, using the "by layer" colour. If you choose the RGB mode, then you get the correct colour.


Here we see the white background hatch on the B layer. The selected dimension values are MText objects (but the dimension line itself is simply linework). The pale yellow insulation colour is also retained.

For practical usage, the line weights could best be set to zero, although keeping them in gives a more faithful visual appearance.


And this is the final result in AutoCAD 2014 without line weights included. This is more usable for drafting and further editing.


Other features?

Apart from the AutoCAD DWG/DXF formats, the software can also convert into Excel sheets, Word documents or PowerPoint presentations. We have not tested this in detail, but assume that it will help if the original came from that application as well. There are options to interpret tables, which is very valuable.

Here is an example of a list of student names from a PDF that was turned into an actual spreadsheet again. Basic formatting is kept and the cells are correctly split. The names are obfuscated here for privacy reasons.

There is also a Batch option, which I tested by converting a series of downloaded academic papers in PDF format into Word documents. Not to insinuate that I'll start copy-pasting other people's work into my own writing, of course.

The result is quite faithful to the original, including pictures, pagination and page layout.

The few files I tested worked without any problems. Most import or conversion issues are due to inherent limitations of what is actually kept inside a PDF from the original software. Results might also vary based on how the PDF was generated: printed through a PDF printer driver or using the integrated PDF conversion command that most CAD and BIM programs have.

The selection mechanism in Able2Extract works fine. I'd usually opt for page-based or window-based selection. When selecting parts of a table, you have to be aware that only the selected (highlighted) items will be retained. And depending on how the PDF is structured, the result might not always have the same document flow as the original.

Alternatives and Comparison?

The next overview is a non-exhaustive comparison of different options with the obtained results from the same document (to compare things that are comparable).

PDF to AutoCAD Web Service

I already discussed a free online service to convert PDF documents to DWG in a previous post:
http://cad-3d.blogspot.com/2013/05/from-pdf-back-to-dwg.html

Returning to the page now reveals ads for Able2Extract. I posed the question to the representative from Able2Extract: the web service operates independently from them, but they do advertise there and know that the service is using their software. Upload was free and the announced conversion time was 30 minutes. I got a first confirmation after a minute. The actual result took exactly 30 minutes to arrive, by mail. I wonder if that is on purpose.

The result is indeed exactly the same: B, H, P and T layers, file structure, element properties. You lack any options or configuration, but you can use it for free. Here we also changed two of the layer colours to make things more visible.


Adobe Illustrator PDF and DXF support

Adobe Illustrator also has options to convert PDF to DWG or DXF, so you might already have a system that you could use. I've tested with CS4, as that one is still running OK on my laptop.

Below is the converted DXF from Able2Extract loaded into Illustrator:


Here, we don't have the full colour information and the document structure is rather difficult to work with (many groups, all on a single layer). It is therefore best not to merge the layers in the DWG/DXF import in Illustrator, so the B, H, P and T layers are retained.


If we reload the DXF without "Merge Layers" set, the layers are imported properly. If we hide the H layer, the drawing is more clearly visible.

The black background came from the white original, but I guess that is due to the strange way AutoCAD uses (and mixes) colour 7 in the palette: black in layout and white in the model. It would be a nice option to allow switching pure black and pure white, so strokes (edges) are black instead of white.



If we directly load the original PDF, we get a more faithful result: hatches become compound paths (still not actual hatches), line types become compound strokes (still not actual line types), arcs are read as Bézier splines and colour information is accurate. The document is a single layer, with many groups, so it can be tricky to further edit. The font was also not recognised and Illustrator complained about a strange shading type.


From what I heard, many students use Illustrator (or InDesign) to get rid of educational banners on PDF output from AutoCAD or ArchiCAD...

Conclusions?

The software does what it promises to do. It is straightforward to use, and once you have your basic conversion settings in place, conversion becomes a simple routine. The added options of Word and Excel conversion can be handy at times. The AutoCAD file structure is usable but not perfect. Colour handling is OK, although the handling of Black/White might be something to look at in detail. I would personally always remove line weights to get a more usable result.

The handling of the background is less ideal, but understandable when trying to get a faithful representation.

Be wary of the inherent limitations of what a PDF rendition of a CAD drawing actually entails! That said, Adobe is able to derive more accurate curved geometry from the same file, so there is still room for improvement.

And layers, which are so important in CAD, are not fully recoverable, alas.