| IFR Story and Mission | |||||||||||||||||||||||||||||||||||||||||
| For
more than 50 years engineers have been trying to solve the problem of machine
cognizance. If we could build machines that understand what is happening
around them, we could automate the planet!
At Colorcom, we have made great progress toward this Holy Grail. We have reached a milestone with the development of a raster to vector converter technology called IFR (Indirect Formularizing Resolution). This technology represents the solution to many critical problems currently presented by unintelligible graphic information. |
|||||||||||||||||||||||||||||||||||||||||
| Problems with Today's Technology | |||||||||||||||||||||||||||||||||||||||||
| Why is visual information unintelligible? Suppose you were having a telephone conversation with a colleague in another city and you asked him about the weather. If he were a computer, his response would be, "I am looking out over the horizon and the first dot in the upper left corner is 75 parts red, 15 parts green, and 192 parts blue." Continuing the description in this way would probably take an additional 1,000,000 unrelated dots. This is not a very efficient route to intelligent recognition, but it does represent the current state of affairs when working with raster data. | |||||||||||||||||||||||||||||||||||||||||
| What is Raster? | |||||||||||||||||||||||||||||||||||||||||
| In
the weather example, the computer was communicating using raster data (dots
or pixels). Almost all sensor data used by computers is in this form. Since
raster data is unintelligible to the computer, there is a limited ability
for the computer to do anything with the data. In other words, the computer
can sense a number of things and often in significant detail, but it cannot
understand the data.
In Figure A, we as humans would see an outline of a diamond in the picture, but a computer would have a difficult time recognizing and explaining the picture. |
|||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| A
machine vision program might define the image by taking the ratio of the
black region's area to its perimeter. An OCR (Optical Character Recognition) program might
call it an "O".
As shown by the two images in Figures B (1&2), computers have a difficult time understanding and working with raster data. |
|||||||||||||||||||||||||||||||||||||||||
Fig. B (1) |
|||||||||||||||||||||||||||||||||||||||||
| Note the distortion that occurs when zooming in on a raster based image. This occurs because of the enlargement of pixels or dots in the image. | |||||||||||||||||||||||||||||||||||||||||
Fig. B (2) |
|||||||||||||||||||||||||||||||||||||||||
| What is Vector? | |||||||||||||||||||||||||||||||||||||||||
| Computer
processors use transistors that use on/off logic, represented by ones and
zeros respectively, as shown in Figure C.
Transistor combinations achieve different logic structures.
For example, we can combine them together with an AND gate: (1 AND 1 = 1) and (1 AND 0 = 0). There are also OR and NOT gates. With these three basic logic tools, AND, OR and NOT, complex logical structures can be built, such as an adder, a subtractor, or an arithmetic logic unit. For this reason, computers are efficient at mathematics and can provide a wide array of mathematical tools. One set of these tools is an array of algorithms that work on pictures held in vector or mathematical format. In Figure C, the X is illustrated as a raster image on the left and a vector image on the right. |
|||||||||||||||||||||||||||||||||||||||||
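As a rough illustration of how AND, OR and NOT combine into arithmetic, the sketch below builds a one-bit full adder from only those three gates and chains it into a multi-bit adder. The function names (`full_adder`, `add_bits`) and the 8-bit default width are illustrative assumptions, not anything taken from IFR itself.

```python
def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NOT(a):
    return 1 - a

def xor(a, b):
    # XOR expressed with only AND, OR, NOT: (a OR b) AND NOT (a AND b)
    return AND(OR(a, b), NOT(AND(a, b)))

def full_adder(a, b, carry_in):
    """Add three single bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return sum_bit, carry_out

def add_bits(x, y, width=8):
    """Ripple-carry addition of two unsigned integers, one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # carry is 1 if the sum overflowed `width` bits
```

Calling `add_bits(5, 9)` walks the carry through eight full adders, which is the same chaining a hardware adder performs.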
![]() |
|||||||||||||||||||||||||||||||||||||||||
| Vector data represents images in math form, which, in a limited way, is understood by computers. Given vector data, the computer can be instructed to do many things with images, including manipulation, animation, lighting, rendering, etc. | |||||||||||||||||||||||||||||||||||||||||
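To make the raster/vector contrast concrete, here is a minimal sketch that stores the same "X" both ways and enlarges each. The 3 by 3 grid, the two line segments, and the helper names are assumptions chosen for illustration. Enlarging the raster just repeats dots, while enlarging the vector form rescales the endpoints of the same two lines.

```python
def scale_raster(pixels, factor):
    """Nearest-neighbor enlargement: each pixel becomes a factor-by-factor block."""
    out = []
    for row in pixels:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def scale_vector(segments, factor):
    """Scale line segments by multiplying every endpoint coordinate."""
    return [((x0 * factor, y0 * factor), (x1 * factor, y1 * factor))
            for (x0, y0), (x1, y1) in segments]

# The same "X" held two ways (illustrative data only).
x_raster = [[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]]
x_vector = [((0, 0), (3, 3)),   # diagonal from top-left to bottom-right
            ((0, 3), (3, 0))]   # diagonal from bottom-left to top-right

big_raster = scale_raster(x_raster, 2)   # 6 x 6 grid of enlarged, blocky dots
big_vector = scale_vector(x_vector, 2)   # still exactly two clean lines
```

The enlarged raster has four times as many values and visibly blockier edges; the enlarged vector still consists of two segments.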
| Raster to Vector Today | |||||||||||||||||||||||||||||||||||||||||
| Almost all data from cameras, microphones, scanners and other sensors are presented to the computer in raster format. Since computers are unable to understand, analyze, or manipulate raster data, they are in a sense handicapped, in that they are deaf, blind and unfeeling. Sensors that far exceed human capability are commonplace, but until now there has not been a method for getting computers to understand raster data. | |||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||
| Without
IFR technology, algorithms cannot bridge this tremendous gap. Manual methods
are available for understanding data, but they are extremely time-consuming.
There are a few programs that are able to convert only simple black and white images to math, but they are unable to convert complicated gray scale and color images to a math form. |
|||||||||||||||||||||||||||||||||||||||||
| Conversion Problem | |||||||||||||||||||||||||||||||||||||||||
| The
need to convert from raster to vector started to reach the critical point
in the 1940s. Since then, there have been thousands of attempts to solve
this difficult conversion problem.
The reason the problem is so difficult is that there are so many combinations of patterns that raster data can take. For example, a 3 by 3 pixel picture matrix with nine colors has about 21,000 possibilities. A raster to vector converter has to convert each of these possibilities into an equation. If just one more pixel is added, the number of possibilities increases exponentially. A picture can have millions of pixels, making the level of complexity very extreme. |
|||||||||||||||||||||||||||||||||||||||||
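The text does not show how the figure of about 21,000 was derived. One plausible reading, offered here only as an assumption, is that it counts the distinct ways nine pixels can be grouped into same-color sets when the identity of each color is ignored. That count is the Bell number B(9) = 21,147. The sketch below computes it with the Bell triangle and, for comparison, the raw count of nine-color assignments.

```python
def bell_number(n):
    """Number of ways to partition n items into groups, via the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]              # each row starts with the previous row's last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

groupings_3x3 = bell_number(9)       # 21147 groupings of nine pixels
groupings_plus_one = bell_number(10) # 115975 when one more pixel is added
raw_assignments = 9 ** 9             # 387420489 if each pixel independently takes one of nine colors
```

Under this reading, adding a tenth pixel multiplies the count by more than five, which matches the rapid growth the text describes.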
![]() |
|||||||||||||||||||||||||||||||||||||||||
| In the examples to follow, we will review various other methods of converting raster data to vector form. Figure D above points out that each solution takes a shortcut which restricts the vector output to some minute part of the total picture. This reduces the number of possibilities and allows for a partial solution that is useful only in a few applications. | |||||||||||||||||||||||||||||||||||||||||
| Limitations of Competitive Solutions | |||||||||||||||||||||||||||||||||||||||||
| Before
the computer age, engineering drawings were designed on paper. Today, there
is a significant demand to convert these paper drawings to a CAD (Computer
Aided Drafting) format. CAD programs operate on vector data. This need
has spawned a number of raster to vector converters for the CAD market.
As shown in Figure E, the vector data, in most instances, is limited to horizontal lines. In a drawing comprised of gray lines on white paper, the horizontal vector line starts on the left side of the gray and extends to the right side of the gray. |
|||||||||||||||||||||||||||||||||||||||||
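The horizontal-line approach described above can be sketched as a scan of each row for runs of non-white pixels, emitting one segment per run. This is a generic run-length sketch, not the code of any particular CAD converter; the grayscale values and the `is_ink` test are assumptions.

```python
def scanline_segments(image, is_ink):
    """Return (row, x_start, x_end) for every horizontal run of ink pixels."""
    segments = []
    for y, row in enumerate(image):
        start = None
        for x, value in enumerate(row):
            if is_ink(value) and start is None:
                start = x                           # left side of the gray
            elif not is_ink(value) and start is not None:
                segments.append((y, start, x - 1))  # right side of the gray
                start = None
        if start is not None:                       # run reaches the edge of the row
            segments.append((y, start, len(row) - 1))
    return segments

WHITE, GRAY = 255, 128
drawing = [[WHITE, GRAY,  GRAY,  WHITE],
           [WHITE, WHITE, WHITE, WHITE],
           [GRAY,  WHITE, GRAY,  GRAY]]
runs = scanline_segments(drawing, lambda v: v < 200)
```

Each segment is independent of the rows above and below it, which is why this kind of output says nothing about vertical or diagonal structure.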
![]() |
|||||||||||||||||||||||||||||||||||||||||
| CAD programs have become more sophisticated, offering dimensions and other important data, but are still limited to tedious hand-drawn input. | |||||||||||||||||||||||||||||||||||||||||
| The
use of wavelets was one of the earliest and most successful approaches
for providing limited raster to vector conversion.
Wavelets sift through the raster data to find each place where the image changes color. These color changes are called borders. Each part of a raster image that is comprised of the same color is called a blob. If a map of the United States depicted Texas as red and Louisiana as blue, each state would be a blob and their perimeters would be borders. Wavelets represent the borders as mathematical equations. |
|||||||||||||||||||||||||||||||||||||||||
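The notion of a border as "each place where the image changes color" can be illustrated with a simple neighbor comparison. The sketch below is not a wavelet transform; it only locates the pixel pairs where a color change occurs, using a toy map in which "T" and "L" stand for the red and blue states.

```python
def find_borders(image):
    """Return pairs of adjacent pixel coordinates (x, y) whose colors differ."""
    borders = []
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            if x + 1 < width and image[y][x] != image[y][x + 1]:
                borders.append(((x, y), (x + 1, y)))   # change to the right
            if y + 1 < height and image[y][x] != image[y + 1][x]:
                borders.append(((x, y), (x, y + 1)))   # change below
    return borders

toy_map = [["T", "T", "L"],
           ["T", "T", "L"],
           ["T", "L", "L"]]
state_line = find_borders(toy_map)
```

The result is a list of border locations, not an equation. Turning those locations into compact mathematics is the hard step the text goes on to discuss.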
![]() |
|||||||||||||||||||||||||||||||||||||||||
| This
method only works well for a few scan lines. With each scan line there
are several different ways that the border can turn. Therefore, over several
scan lines, the math gets so complicated that it becomes impractical.
As shown in Figure F, wavelets would have no trouble with the north part of the Texas-Louisiana border because it is a straight line, but the Sabine River defines the rest of the border. Therefore, wavelets have to try to generate math that would define the Sabine River. After a few twists and turns, the math gets too complicated to be of much use. Wavelet designers have tried to get around this problem by defining the river with small sections of independent equations, but this solution comes up short. |
|||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||
| As shown in Figure G, if the picture of the border between equations 3 and 4 is zoomed, the picture breaks apart because equation 3 is not tied to equation 4. Upon computational completion, wavelets provide a number of unrelated equations to represent the total raster image. | |||||||||||||||||||||||||||||||||||||||||
| The
fractal method, although limited, has been deemed the most straightforward
approach for representing an image. Fractal math takes each part of the
image that needs to be aggregated and substitutes a geometric shape from
a library of shapes. Fractals are considered to be a raster to vector converter
because the geometric shapes are in a mathematical format.
Fractal solutions are restricted because the library of shapes used is very limited. This is in comparison to all of the possible shapes required to accurately create a mathematical picture. Thus, as shown in Figure H, images created with fractals are fragmented and made up of non-related objects. |
|||||||||||||||||||||||||||||||||||||||||
![]() |
|||||||||||||||||||||||||||||||||||||||||
| The following Figures I (1 & 2) illustrate how a fractal image fragments, or breaks apart, when it is magnified. This is because of the non-related objects that comprise the image. | |||||||||||||||||||||||||||||||||||||||||
Fig. I (1) |
|||||||||||||||||||||||||||||||||||||||||
Fig. I (2) |
|||||||||||||||||||||||||||||||||||||||||
| The
automatic Bezier curve technique is perhaps the most common type of raster
to vector conversion. Unlike fractals or wavelets, Bezier techniques cannot
convert complicated pictures without loss. Like IFR, Bezier can sequence
equations together over many pixels; however, unlike the lossless conversion
through IFR, the Bezier result is lossy.
The Bezier method, as shown in Figure J, identifies the borders of each blob in the image. These borders, though somewhat arbitrary, are located and tagged. These tags then become the new representation of the image. When the output image is rasterized for display, Bezier curves are used to connect the tags when reconstructing the image in vector form. |
|||||||||||||||||||||||||||||||||||||||||
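For readers unfamiliar with how a Bezier curve connects two tagged points, the following sketch evaluates a standard cubic Bezier curve. The two tags are the end points `p0` and `p3`; the intermediate control points `p1` and `p2`, and how a converter would choose them, are assumptions made for illustration.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    w0, w1, w2, w3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)

def sample_curve(p0, p1, p2, p3, steps=16):
    """Sample the curve at evenly spaced parameter values for display."""
    return [cubic_bezier(p0, p1, p2, p3, i / steps) for i in range(steps + 1)]

# Two tags joined by a curve that bulges upward (illustrative coordinates).
tag_a, tag_b = (0.0, 0.0), (1.0, 0.0)
points = sample_curve(tag_a, (0.0, 1.0), (1.0, 1.0), tag_b)
```

The curve always passes through the two tags, but everything between them is determined by the control points, which is where detail from the original border can be lost.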
![]() |
|||||||||||||||||||||||||||||||||||||||||
| The problem with this approach is that it has a hard time dealing with complicated images. When tags are placed randomly, some definition is lost. In a complex image, many tags are placed, resulting in a great deal of loss in picture definition. As shown in Figure K below, this inefficient tagging process is easily recognized when viewing a close-up of a picture that has been converted to vector form using Bezier conversion. | |||||||||||||||||||||||||||||||||||||||||
Fig K |
|||||||||||||||||||||||||||||||||||||||||
| A more severe problem with the Bezier curve approach is that the solution only looks at simple borders. In a complex image, there is usually an interaction of shades that catches the eye. The Bezier curve method treats all of these as either borders or points. This often causes the Bezier conversion to be more complicated than the original raster picture representation, resulting in a solution that is actually worse than the problem. | |||||||||||||||||||||||||||||||||||||||||
| What is IFR Technology? | |||||||||||||||||||||||||||||||||||||||||
| The
IFR approach is different, in that the raster data is processed in an exact
fashion, using a massive digital front-end. Taking over 14 years to develop,
IFR understands any combination of raster data that can be presented. It
is a total solution approach.
Through IFR, raster data is translated into a more abstract data form. These data representations are then transformed into several other representations and then finally converted into vector form. Since people perform raster to vector conversions quickly and easily in their heads, the challenge was how to learn from this process and reproduce the conversion. This translation could not be accomplished if humans had perfect recognition. IFR development is based on the fact that humans make mistakes in their perception. We have translated this understanding into a new and revolutionary technology called IFR. From a philosophical viewpoint, if the technology works with visual perception, then it should work when applied to the other senses. Experimentation with IFR supports this philosophy. We believe that IFR is equally applicable with many digitized sensors such as sound, touch, humidity, acceleration, etc. The possibilities seem endless. |
|||||||||||||||||||||||||||||||||||||||||
| Artifact Aggregation | |||||||||||||||||||||||||||||||||||||||||
| In
all sensor data, an intelligent aggregation of the data needs to be performed
before meaningful information can be extracted.
Looking at a complex raster picture where every dot is a different color from any of its neighbors, we would need to see similarities between the selected dots and their neighbors before we could decipher artifacts or images in the picture. Even if a theoretical "super-human" could see all of the pixel to pixel color changes in a picture, he would need to be able to ignore the small color changes between pixels to understand or decipher the picture as well as a mere mortal. Intelligence demands that the pixels be aggregated before artifacts can be extracted from a picture. In order to fully understand the picture, the "super-human" would need to generalize similar colors and classify them by lumping them into categories. This process is the first step toward computers gaining intelligent understanding of the world around them. Our internal system of generalization has much to do with why certain colors seem to clash with others. By looking at clashing colors, we can learn about how humans generalize and classify colors. The visual information in pictures gives us clues that allow us to lump together some adjacent pixels and make it easier to decipher the picture. |
|||||||||||||||||||||||||||||||||||||||||
| Color Complexity | |||||||||||||||||||||||||||||||||||||||||
| One
of the first questions asked about IFR is how the technology handles complex
colors. In other words, how are images handled in a 24-bit color picture
where almost every pixel has a different color?
A strategy that is critical to the overall success of IFR is to keep machine cognizance relevant to human perception. Humans see only somewhere between 17 and 19 bits of color. The last 5 or 6 bits of a 24-bit color palette are wasted except for various interpolation schemes that increase resolution. If a picture is perceived in a low noise environment such as scanning, there is not much fluctuation that exceeds 4 to 8 color levels (i.e., 2 or 3 bits). Therefore, in the case of scanning, we can aggregate at perhaps 20 bits of color where we would be above the color noise and beyond human ability. If a worst case analysis is done from a theoretical noise perspective, about eight color levels are needed to get around the noise.
As a test, we filmed video using a poor quality VHS camcorder. This was sampled at 4 times the color-burst frequency, which added a great deal of noise. Over-sampling of clocked data can cause bias due to the violation of the Nyquist sampling theorem. There is at most one color per quadrant of the wave cycle. We found that the video noise peaked at about six color levels at 24-bit color. As a sanity check, we did the same test on a weak aerial TV signal. Theoretically, this noise would be less than that of the camcorder because of the poor color-burst phase lock on old VHS systems. Our suspicions were correct. The noise on this picture was 4 to 5 color levels.
Slew rate distortion (e.g., a slow video amp) can enter into this picture. There are cases where color trends, like an intended smear, can exist without adjacent pixels being the same; however, it is easy to distinguish between invalid slews and valid smears. IFR includes algorithms that handle all of these concerns. They are implemented as simple state machine functions that do not require a math library. IFR eliminates the color problem with a color generalization algorithm that lumps adjacent color pixels together.
In some ways this could be considered lossy; however, it should be considered intelligent. In the section entitled "Artifact Aggregation," we pointed out that a picture must have artifacts to have any visual intelligence. This does not occur unless adjacent pixels are lumped together to decipher the artifacts. |
|||||||||||||||||||||||||||||||||||||||||
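The idea of lumping adjacent pixels whose colors differ by less than the noise level can be sketched with a tolerance-based flood fill. This is a generic region-growing technique swapped in for illustration, not IFR's color generalization algorithm, whose details are not given here. The default tolerance of 8 levels per channel follows the worst-case noise figure quoted above; the function name and data are assumptions.

```python
from collections import deque

def aggregate_colors(image, tolerance=8):
    """Label 4-connected regions whose neighboring pixels differ by at most
    `tolerance` levels in every RGB channel. Returns (labels, region_count)."""
    height, width = len(image), len(image[0])
    labels = [[None] * width for _ in range(height)]
    region_count = 0
    for seed_y in range(height):
        for seed_x in range(width):
            if labels[seed_y][seed_x] is not None:
                continue
            labels[seed_y][seed_x] = region_count
            queue = deque([(seed_x, seed_y)])
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if not (0 <= nx < width and 0 <= ny < height):
                        continue
                    if labels[ny][nx] is not None:
                        continue
                    close = all(abs(a - b) <= tolerance
                                for a, b in zip(image[y][x], image[ny][nx]))
                    if close:
                        labels[ny][nx] = region_count
                        queue.append((nx, ny))
            region_count += 1
    return labels, region_count

noisy = [[(100, 100, 100), (104, 98, 101), (200, 50, 50)],
         [(99, 103, 100),  (102, 100, 97), (205, 52, 48)]]
labels, count = aggregate_colors(noisy)
```

With a tolerance of 0 every pixel in `noisy` is its own region; at 8 the six pixels collapse into two blobs. Because each pixel is compared with its immediate neighbor, a slow gradient can chain into one region, so a practical segmenter would need an additional rule for smears.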
| Smears | |||||||||||||||||||||||||||||||||||||||||
| IFR
intrinsically borders the smears and treats them as blobs, but the first
version of IFR will not have the ability to handle adjacent smears efficiently.
This is not a problem; it is just that current IFR development has not
placed this as a high priority.
In the various video pictures that were tested, there were no smears to be found except for the intentional smear that was part of the video color pattern generator test, and this smear was not adjacent to another one. |
|||||||||||||||||||||||||||||||||||||||||
| Over Aggregation | |||||||||||||||||||||||||||||||||||||||||
| The
solution is part technical and part social. With current state of the art
computing, it takes the computer an inordinate amount of time to render
a complex picture. IFR has the ability to over-aggregate colors, which
eliminates insignificant detail.
Many pictures contain insignificant details that are not needed or desired. When IFR compression is taken into the lossy mode, aggregation can actually become an effective advantage. For example, a high-resolution scan of a 35mm photograph might show creases and wrinkles in the backdrop. These slight shade differences are easily eliminated when the picture is over-aggregated. |
|||||||||||||||||||||||||||||||||||||||||
| IFR Development | |||||||||||||||||||||||||||||||||||||||||
| (1986-1988)
Search for the complete data set |
|||||||||||||||||||||||||||||||||||||||||
| Initially, IFR was pursued as an experiment. The goal was to determine if all possibilities in an image could be represented in a truth table. We knew that the data needed to be translated into abstract form. The search was to find the form. The driving force behind this endeavor came from a philosophical principle from the ancient Greeks that all things are made up of parts. For example, matter is made up of molecules. Molecules are made up of atoms and the process goes on and on. Pictures are made up of a few simple parts. These parts were finally identified and defined in 1988. We called these simple parts "the complete data set." | |||||||||||||||||||||||||||||||||||||||||
| (1988-1990)
Build the resolution enhancement program |
|||||||||||||||||||||||||||||||||||||||||
| In this period we wanted to familiarize ourselves with and work with the parts of the complete data set. With this in mind, we decided to build a resolution enhancement program which was completed in 1990. | |||||||||||||||||||||||||||||||||||||||||
| (1990-1992)
Find a true raster to vector solution |
|||||||||||||||||||||||||||||||||||||||||
| In the summer of 1992, after two challenging years of searching for answers, we began to find what we needed to solidify the concepts behind the technology. By December of 1992 we had a method (on paper) for converting raster images to vector form. | |||||||||||||||||||||||||||||||||||||||||
| (1993-1996)
Implement the first prototype |
|||||||||||||||||||||||||||||||||||||||||
| Since 1993, the methods behind the technology have been greatly refined. This stage of IFR development took longer than was originally expected because after each working implementation we continually found better implementation routines. Certain parts of the technology have been reworked more than thirty times. This will ensure an exhaustive solution to the need for an accurate raster to vector converter. | |||||||||||||||||||||||||||||||||||||||||
| (1996-1999)
Develop the Data Tagger |
|||||||||||||||||||||||||||||||||||||||||
| Since
June of 1993, the technology has been on a ten-step development path
to implement IFR Technology. Demos 1, 2 and 3 were completed in 1996.
Demo 4 was achieved in July of 1998. Demo 5 was completed in February 1999.
The data tagger (raster to vector converter) successfully completed all of
its assigned tests in August of 1999.
|
|||||||||||||||||||||||||||||||||||||||||
| (1999-2001)
Develop the Color Segmenter |
|||||||||||||||||||||||||||||||||||||||||
| A
color segmenter groups colors together. When a human, machine, or anything
else looks at a picture, the colors must be grouped together to see things in
the picture. For example, if we couldn't tell which colors belong to an
animal in the forest, we wouldn't be able to see the animal.
Color photographs, video frames, and other types of data from electronic sensors have a lot of noise in the picture. In fact, in many cases there are no two adjacent dots that are the same color in the picture. Without any noise reduction, the mathematics would only extend over one dot, which would be meaningless. Our color segmenter needs to group colors together with human-like accuracy, and at the same time leave the picture at photographic quality. Over the last 30 years there have been at least a couple of hundred attempts at color segmentation. The problem is normally considered impossible among large companies, but it is a favorite topic among PhD dissertations. The color segmenters on the market are pretty coarse. For example, a photo of a barn will group all of the red of the barn together into one color, but there should be many shades of red because of texture and lighting among other things. Our color segmenter works at photographic quality. In fact, we improve the original photograph. To back up this claim, we took a worst case photo which was a fuzzy picture of a rose. Slight and gradual color changes are hard to segment as demonstrated in the barn example above, and fuzzy pictures are harder to segment than clear pictures. The before and after pictures of this test case represent segmentation that you won't find anywhere else. The color segmenter was completed in June of 2001. |
|||||||||||||||||||||||||||||||||||||||||
| (2001-Current)
Develop the Blob Compressor |
|||||||||||||||||||||||||||||||||||||||||
| The
color segmenter takes most of the noise out of the image, but there is still
significant noise left. For example, we printed a 3 color picture and scanned
it back in. These before and after pictures
show that most of the noise is taken out of the picture, but the text and
lines are still fuzzy. In fact, if a page full of identical 'e's on a word
processor were printed out and scanned in, they would all be different. The
computer doesn't want to think about the differences in all those 'e's
any more than anyone else does. The blob compressor changes similar 'e's into
identical 'e's.
The color segmenter and blob compressor are noise reduction filters that respectively group colors and shapes together. By restoring the image, we can tag specific letters, words, sentences, paragraphs, or even pages. Therefore, we can obtain much greater compression than the current standard statistical compressor, TIFF Group 4. By combining color and shape grouping, the technology can recognize color text (OCR). Interframe video compression should be profoundly impacted by this new technology. |
|||||||||||||||||||||||||||||||||||||||||
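Turning similar 'e's into identical 'e's can be pictured as matching each scanned shape against a small library of prototypes and storing only an index when a match is close enough. The sketch below uses a Hamming-distance match over equal-size bitmaps; the threshold, the bitmap encoding, and the function names are illustrative assumptions, not the blob compressor's actual method.

```python
def hamming(a, b):
    """Count differing pixels between two bitmaps of identical size."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def compress_blobs(glyphs, max_diff=2):
    """Replace each glyph by the index of the first library shape within
    `max_diff` pixels of it, adding a new library entry when none matches."""
    library, indices = [], []
    for glyph in glyphs:
        for i, prototype in enumerate(library):
            if hamming(glyph, prototype) <= max_diff:
                indices.append(i)
                break
        else:
            library.append(glyph)
            indices.append(len(library) - 1)
    return library, indices

clean_e = ["0110", "1001", "1111", "1000", "0111"]
noisy_e = ["0110", "1001", "1111", "1000", "0110"]   # one pixel lost in scanning
letter_o = ["0110", "1001", "1001", "1001", "0110"]
library, indices = compress_blobs([clean_e, noisy_e, letter_o, clean_e])
```

Four scanned shapes reduce to a two-entry library plus four small indices, and every 'e' is reproduced from the same prototype.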
| Schedule of Core Engineering Demos | |||||||||||||||||||||||||||||||||||||||||
|
Demo 6: The blob compressor is complete with the following characteristics.
Demo 7: The auto-set feature of the color segmenter is
Demo 8: The glider is completed.
Demo 9: Initiates product release to development engineering of audio streaming.
Demo 10: Initiates product release to development engineering of vector based video streaming.
|
|||||||||||||||||||||||||||||||||||||||||
| Looking Ahead | |||||||||||||||||||||||||||||||||||||||||
| We
plan to have IFR technology available in marketable software module form
in 2005. This code will then be available to OEMs for use in their products.
Marketing of the current technology will expand as the development of IFR
continues to its full potential.
Below is the current roadmap for the technology. In all instances, software implementations will occur before hardware implementations. All compression number estimates are based on complex images, such as the speed-skater picture (Figure B, approx. a 380 by 330 pixel image) seen previously in this white paper. |
|||||||||||||||||||||||||||||||||||||||||
| Raster to Vector Picture Conversion: | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| IFR Application to Other Sensors: | |||||||||||||||||||||||||||||||||||||||||
| Soon, efforts will be made to implement IFR technology within other data forms including: | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| Raster to Vector Conversion of Video: | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| This process creates an intelligent library of shapes that are used in a single image. | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| Inter-frame Compression: | |||||||||||||||||||||||||||||||||||||||||
| Artifacts from multiple pictures are stored in a single library. | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| Static Identification: | |||||||||||||||||||||||||||||||||||||||||
| Identify artifacts in a picture. | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| Dynamic Identification: | |||||||||||||||||||||||||||||||||||||||||
| Identify artifacts in video. | |||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
| Applying IFR Technology | |||||||||||||||||||||||||||||||||||||||||
| IFR
solves a problem that has been growing for more than 50 years. For many
years, there has been a need for a raster to vector converter that could
seamlessly sequence across a large, complex picture. Computers have served
us well in many ways by bringing us to the information age. With the help
of a raster to vector converter, we can start to effectively process the
information that we have been collecting and storing.
With this in mind, IFR will have a dramatic impact on many markets. We have identified over 150 market segments that can immediately benefit from IFR. These benefits can be broken into four categories: |
|||||||||||||||||||||||||||||||||||||||||
| Current lossless compression using conventional methods on a complex TV image is about 3 to 4 times. IFR will offer a compression rate of 120 times, with expected rates as high as 800 times in the long term. | |||||||||||||||||||||||||||||||||||||||||
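To put these ratios in concrete terms, the short calculation below applies them to an image the size of the speed-skater picture (about 380 by 330 pixels). The assumption that the uncompressed image uses 3 bytes (24 bits) per pixel is ours; the ratios are the ones quoted above.

```python
WIDTH, HEIGHT = 380, 330
BYTES_PER_PIXEL = 3                     # assumes uncompressed 24-bit color

raw_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

def compressed_size(raw, ratio):
    """Size in bytes after compressing `raw` bytes by the given ratio."""
    return raw / ratio

sizes = {ratio: compressed_size(raw_bytes, ratio) for ratio in (3, 4, 120, 800)}
```

Under that assumption the raw image is 376,200 bytes. A 3 to 4 times ratio leaves roughly 94,000 to 125,000 bytes, 120 times leaves about 3,100 bytes, and 800 times leaves under 500 bytes.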
| A problem with raster data is that it is set at a fixed resolution. If the picture is converted to vector form, it can overcome this resolution problem in the same way that PostScript allows for the scalability of fonts. Additionally, IFR divorces bandwidth from resolution which enhances clarity. | |||||||||||||||||||||||||||||||||||||||||
| The whole world (even what your eyes see) captures images in raster, but everything (even in your mind) processes in some form of vector. Until now, no algorithm could bridge this tremendous gap. It is possible to do it manually, but it is very tedious and time-consuming. Therefore, many art forms ranging from CAD to electronic games have suffered dramatic performance losses. IFR can convert the real world into manipulatable art formats in tenths of a second. | |||||||||||||||||||||||||||||||||||||||||
| In
their current form, computing machines are in a sense handicapped, in that
they are deaf, blind, and unfeeling. Sensors that far exceed human capability
are commonplace but until now there has not been a method for giving computers
the ability to understand data.
Today's most common DSP method
is to sample a spectrum of the signal with a Fourier transform. Unfortunately,
this provides only a fraction of the information available in the signal.
Since IFR can convert raster data into a mathematical form, IFR will offer
computers the ability to understand and be aware of their surroundings.
Almost all of the electronics industry
will benefit strongly from one or more of these four primary IFR benefits.
When IFR is implemented to saturation, it will encompass about 20 billion
applications. Below are just a few examples where the benefits of IFR will
greatly enhance hardware (refer to categories previously mentioned).
|
|||||||||||||||||||||||||||||||||||||||||
| Category Reference | ||||
| HARDWARE | (1) | (2) | (3) | (4) |
| Monitors | x | x | ||
| Disk Drives | x | x | ||
| Keyboards | x | |||
| Modems | x | x | ||
| Scanners | x | |||
| Printers | x | |||
| Video cards | x | |||
| Network cards | x | x | ||
| In addition
to this list there are a number of peripherals that IFR will create, as
IFR is the first technology that translates sensory input into something
the computer can understand.
Colorcom is now encouraging qualified technical developers to participate in a partnership to bring IFR to the marketplace. |
|
| © 2003-2008 Colorcom, Ltd, All rights reserved. | |