Graduation Thesis — Foreign Literature Translation (Chinese and English)
Major: Mechanical Design, Manufacturing and Automation    Topic: Mechanical Design of a Multi-Degree-of-Freedom Manipulator
English Original
Automated Tracking and Grasping of a Moving Object with a Robotic Hand-Eye System
Abstract
Most robotic grasping tasks assume a stationary or fixed object. In this paper, we explore the requirements for tracking and grasping a moving object. The focus of our work is to achieve a high level of interaction between a real-time vision system capable of tracking moving objects in 3-D and a robot arm with gripper that can be used to pick up a moving object. There is an interest in exploring the interplay of hand-eye coordination for dynamic grasping tasks such as grasping of parts on a moving conveyor system, assembly of articulated parts, or for grasping from a mobile robotic system. Coordination between an organism's sensing modalities and motor control system is a hallmark of intelligent behavior, and we are pursuing the goal of building an integrated sensing and actuation system that can operate in dynamic as opposed to static environments.
The system we have built addresses three distinct problems in robotic hand-eye coordination for grasping moving objects: fast computation of 3-D motion parameters from vision, predictive control of a moving robotic arm to track a moving object, and interception and grasping. The system is able to operate at approximately human arm movement rates, and experimental results are presented in which a moving model train is tracked, stably grasped, and picked up by the system. The algorithms we have developed that relate sensing to actuation are quite general and applicable to a variety of complex robotic tasks that require visual feedback for arm and hand control.
I. INTRODUCTION
The focus of our work is to achieve a high level of interaction between real-time vision systems capable of tracking moving objects in 3-D and a robot arm equipped with a dexterous hand that can be used to intercept, grasp, and pick up a moving object. We are interested in exploring the interplay of hand-eye coordination for dynamic grasping tasks such as grasping of parts on a moving conveyor system, assembly of articulated parts, or for grasping from a mobile robotic system.
Coordination between an organism's sensing modalities and motor control system is a hallmark of intelligent behavior, and we are pursuing the goal of building an integrated sensing and actuation system that can operate in dynamic as opposed to static environments.
There has been much research in robotics over the last few years that addresses either visual tracking of moving objects or generalized grasping problems. However, there have been few efforts that try to link the two problems. It is quite clear that complex robotic tasks such as automated assembly will need to have integrated systems that use visual feedback to plan, execute, and monitor grasping.
The system we have built addresses three distinct problems in robotic hand-eye coordination for grasping moving objects: fast computation of 3-D motion parameters from vision, predictive control of a moving robotic arm to track a moving object, and interception and grasping. The system is able to operate at approximately human arm movement rates, using visual feedback to track, intercept, stably grasp, and pick up a moving object. The algorithms we have developed that relate sensing to actuation are quite general and applicable to a variety of complex robotic tasks that require visual feedback for arm and hand control.
Our work also addresses a very fundamental and limiting problem that is inherent in building integrated sensing-actuation systems: the integration of subsystems with different sampling and processing rates. Most complex robotic systems are actually amalgams of different processing devices, connected by a variety of methods. For example, our system consists of three separate computation systems: a parallel image processing computer; a host computer that filters, triangulates, and predicts 3-D position from the raw vision data; and a separate arm control system computer that performs inverse kinematic transformations and joint-level servoing. Each of these systems has its own sampling rate, noise characteristics, and processing delays, which need to be integrated to achieve smooth and stable real-time performance. In our case, this involves overcoming visual processing noise and delays with a predictive filter based upon a probabilistic analysis of the system noise characteristics. In addition, real-time arm control needs to be able to operate at fast servo rates regardless of whether new predictions of object position are available.
The system consists of two fixed cameras that can image a scene containing a moving object (Fig. 1). A PUMA-560 with a parallel jaw gripper attached is used to track and pick up the object as it moves (Fig. 2). The system operates as follows:
1) The imaging system performs a stereoscopic optic-flow calculation at each pixel in the image. From these optic-flow fields, a motion energy profile is obtained that forms the basis for a triangulation that can recover the 3-D position of a moving object at video rates.
2) The 3-D position of the moving object computed in step 1 is initially smoothed to remove sensor noise, and a nonlinear filter is used to recover the trajectory parameters needed for forward prediction; the updated position is then sent to the trajectory-planner/arm-control system.
3) The trajectory planner updates the joint-level servos of the arm via kinematic transform equations. An additional fixed-gain filter is used to provide servo-level control in case of missed or delayed communication from the vision and filtering system.
4) Once tracking is stable, the system commands the arm to intercept the moving object and the hand is used to grasp the object stably and pick it up.
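To make the interaction of these four steps concrete, the sketch below shows, in Python, one plausible way such a loop could be organized. It is only an illustration of the architecture described above: the object interfaces (vision, filt, arm), the loop rates, and the prediction horizon are hypothetical placeholders, not the actual software used in this system.

```python
import time

VISION_PERIOD = 1.0 / 60.0   # new triangulated 3-D position roughly once per frame
SERVO_PERIOD = 1.0 / 250.0   # arm servo loop runs much faster than the vision loop

def track_and_grasp(vision, filt, arm, horizon=0.1):
    """Hypothetical outline of steps 1)-4): vision -> filter/predict -> servo -> grasp."""
    last_vision = 0.0
    while not arm.object_grasped():
        now = time.monotonic()

        # Steps 1-2: fold each new (noisy, delayed) 3-D measurement into the filter.
        if now - last_vision >= VISION_PERIOD:
            z = vision.read_3d_position()     # may be None if no motion was detected
            if z is not None:
                filt.update(z)
            last_vision = now

        # Step 3: the servo level always has a setpoint, even when a vision update
        # is missed or delayed, by extrapolating the filtered trajectory forward.
        target = filt.predict(lead=horizon)
        arm.servo_to(target)

        # Step 4: once tracking is stable, intercept the object and grasp it.
        if filt.is_stable() and arm.near(target):
            arm.intercept_and_grasp(target)

        time.sleep(SERVO_PERIOD)
```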
The following sections of the paper describe each of these subsystems in detail along with experimental results.
II. PREVIOUS WORK
Previous efforts in the areas of motion tracking and real-time control are too numerous to exhaustively list here. We instead list some notable efforts that have inspired us to use similar approaches. Burt et al. [9] have focused on high-speed feature detection and hierarchical scaling of images in order to meet the real-time
demands of surveillance and other robotic applications. Related work has been reported by Lee and Wohn [29] and by Wiklund and Granlund [43], who use image differencing methods to track motion. Corke, Paul, and Wohn [13] report a feature-based tracking method that uses special-purpose hardware to drive a servo controller of an arm-mounted camera. Goldenberg et al. [16] have developed a method that uses temporal filtering with vision hardware similar to our own. Luo, Mullen, and Wessel [30] report a real-time implementation of motion tracking in 1-D based on Horn and Schunck's method. Verghese et al. [41] report real-time short-range visual tracking of objects using a pipelined system similar to our own. Safadi [37] uses a tracking filter similar to our own and a pyramid-based vision system, but few results are reported with this system. Rao and Durrant-Whyte [36] have implemented a Kalman filter-based decentralized tracking system that tracks moving objects with multiple cameras. Miller [31] has integrated a camera and arm for a tracking task where the emphasis is on learning kinematic and control parameters of the system. Weiss et al. [42] also use visual feedback to develop control laws for manipulation. Brown [8] has implemented a gaze control system that links a robotic "head" containing binocular cameras with a servo controller that allows one to maintain a fixed gaze on a moving object. Clark and Ferrier [12] also have implemented a gaze control system for a mobile robot. A variation of the tracking problem is the case of moving cameras. Some of the papers addressing this interesting problem are [9], [15], [44], and [18].
The majority of literature on the control problems encountered in motion tracking experiments is concerned with the problem of generating smooth, up-to-date trajectories from noisy and delayed outputs from different vision algorithms. Our previous work [4] coped with that problem in a similar way as in [38], using an α-β-γ filter, which is a form of steady-state Kalman filter. Other approaches can be found in [33], [34], [28], [6]. In the work of Papanikolopoulos et al. [33], [34], visual sensors are used in the feedback loop to perform adaptive robotic visual tracking. Sophisticated control schemes are described which combine a Kalman filter's estimation and filtering power with an optimal (LQG) controller which computes the robot's motion. The vision system uses an optic-flow computation based
on the SSD (sum of squared differences) method which, while time consuming, appears to be accurate enough for the tracking task. Efficient use of windows in the image can improve the performance of this method. The authors have presented good tracking results and state that the controller is robust enough that the use of more complex (time-varying LQG) methods is not justified. Experimental results with the CMU Direct Drive Arm II show that the methods are quite accurate, robust, and promising.
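As a concrete illustration of the filtering approach mentioned above, the following is a minimal sketch of a one-dimensional α-β-γ (steady-state Kalman) tracker. The gain values, the constant-acceleration model, and the sample period are illustrative assumptions rather than the parameters used in [4] or [38].

```python
class AlphaBetaGammaFilter:
    """Minimal 1-D alpha-beta-gamma tracker: estimates position, velocity, and
    acceleration from noisy position measurements arriving every dt seconds."""

    def __init__(self, alpha=0.5, beta=0.4, gamma=0.1, dt=1.0 / 60.0):
        self.alpha, self.beta, self.gamma, self.dt = alpha, beta, gamma, dt
        self.x = self.v = self.a = 0.0

    def predict(self, lead=0.0):
        """Extrapolate the current state lead seconds ahead (constant acceleration)."""
        return self.x + self.v * lead + 0.5 * self.a * lead * lead

    def update(self, z):
        """Fold in a new noisy position measurement z and return the smoothed position."""
        # Predict the state forward one sample period.
        x_pred = self.x + self.v * self.dt + 0.5 * self.a * self.dt ** 2
        v_pred = self.v + self.a * self.dt
        # Correct with the measurement residual using fixed steady-state gains.
        r = z - x_pred
        self.x = x_pred + self.alpha * r
        self.v = v_pred + (self.beta / self.dt) * r
        self.a = self.a + (2.0 * self.gamma / self.dt ** 2) * r
        return self.x
```

One such filter per Cartesian coordinate yields a smoothed state that can be extrapolated forward to compensate for vision latency; the fixed gains trade responsiveness against noise rejection.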
The work of Lee and Kay [28] addresses the problem of uncertainty in the cameras' location in the robot's coordinate frame. The requirement that the cameras be strictly fixed in the robot's frame can be quite inconvenient, since each time they are (most often accidentally) displaced, one has to undertake the tedious job of recalibrating them. Again, the estimation of the moving object's position and orientation is done in Cartesian space and a simple error model is assumed. Andersen et al. [6] adopt a third-order Kalman filter in order to allow a robotic system (consisting of two degrees of freedom) to play the labyrinth game. A somewhat different approach has been explored in the work of Houshangi [24] and Koivo et al. [27]. In these works, the autoregressive (AR) and autoregressive moving-average with exogenous input (ARMAX) models are investigated for visual tracking.
III. VISION SYSTEM
In a visual tracking problem, motion in the imaging system has to be translated into 3-D scene motion. Our approach is to initially compute local optic-flow fields that measure image velocity at each pixel in the image. A variety of techniques for computing optic-flow fields have been used with varying results including
matching-based techniques [5], [10], [39], gradient-based techniques [23], [32], [11], and spatio-temporal energy methods [20], [2]. Optic-flow was chosen as the primitive upon which to base the tracking algorithm for the following reasons.
·The ability to track an object in three dimensions implies that there will be motion across the retinas (image planes) that are imaging the scene. By identifying this motion in each camera, we can begin to find the actual 3-D motion.
·The principal constraint in the imaging process is high computational speed to satisfy the update process for the robotic arm parameters. Hence, we needed to be able to
compute image motion quickly and robustly. The Horn-Schunck optic-flow algorithm (described below) is well suited for real-time computation on our PIPE image processing engine.
·We have developed a new framework for computing optic-flow robustly using an estimation-theoretic framework [40]. While this work does not specifically use these ideas, we have future plans to try to adapt this algorithm to such a framework.
Our method begins with an implementation of the Horn-Schunck method of computing optic-flow [22]. The underlying assumption of this method is the optic-flow constraint equation, which assumes the image irradiance at time t and at time t + δt will be the same.
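In Horn and Schunck's notation, this irradiance-constancy assumption can be written as

$$I(x + \delta x,\; y + \delta y,\; t + \delta t) = I(x, y, t).$$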
If we expand this constraint via a Taylor series expansion and drop second- and higher-order terms, we obtain the form of the constraint we need to compute normal velocity.
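To first order, this expansion gives the standard optic-flow constraint equation

$$I_x u + I_y v + I_t = 0.$$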
Here u and v are the velocities in image space, and Ix, Iy, and It are the spatial and temporal derivatives in the image. This constraint limits the velocity field in an image to lie on a straight line in velocity space. The actual velocity cannot be determined directly from this constraint due to the aperture problem, but one can recover the component of velocity normal to this constraint line.
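In the usual formulation, this normal component has magnitude

$$v_\perp = \frac{|I_t|}{\sqrt{I_x^2 + I_y^2}},$$

whose denominator is the quantity that the hardware pipeline approximates in steps 7-8 below by the absolute value of the summed spatial gradients.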
A second, iterative process is usually employed to propagate velocities in image neighborhoods, based upon a variety of smoothness and heuristic constraints. These added neighborhood constraints allow for recovery of the actual velocities u, v in the image. While computationally appealing, this method of determining optic-flow has some inherent problems. First, the computation is done on a pixel-by-pixel basis, creating a large computational demand. Second, the information on optic flow is only available in areas where the gradients defined above exist.
We have overcome the first of these problems by using the PIPE image processor [26], [7]. The PIPE is a pipelined parallel image processing computer capable of processing 256 x 256 x 8 bit images at frame rate speeds, and it supports the operations necessary for optic-flow computation in a pixel parallel method (a typical
image operation such as convolution, warping, or addition/subtraction of images can be done in one cycle, 1/60 s). The second problem is alleviated by our not needing to know the actual velocities in the image. What we need is the ability to locate and quantify gross image motion robustly. This rules out simple differencing methods, which are too prone to noise and will make location of image movement difficult. Hence, a set of normal velocities at strong gradients is adequate for our task, precluding the need to iteratively propagate velocities in the image.
A. Computing Normal Optic-Flow in Real-Time
Our goal is to track a single moving object in real time. We are using two fixed cameras that image the scene and need to report motion in 3-D to a robotic arm control program. Each camera is calibrated with the 3-D scene, but there is no explicit need to use registered (i.e., scan-line coherence) cameras. Our method computes the normal component of optic-flow for each pixel in each camera image, finds a centroid of motion energy for each image, and then uses triangulation to intersect the back-projected centroids of image motion in each camera. Four processors are used in parallel on the PIPE, assigned two per camera, one each for the calculation of the X and Y motion energy centroids in each image. We also use a special processor board (ISMAP) to perform real-time histogramming. The steps below correspond to the numbers in Fig. 3.
1) The camera images the scene and the image is sent to processing stages in the PIPE.
2) The image is smoothed by convolution with a Gaussian mask. The convolution operator is a built-in operation in the PIPE and it can be performed in one frame cycle.
3-4) In the next two cycles, two more images are read in, smoothed, and buffered, yielding smoothed images I0, I1, and I2. The ability to buffer and pipeline images allows temporal operations on images, albeit at the cost of processing delays (lags) on output. There are now three smoothed images in the PIPE, with the oldest image lagging by 3/60 s.
5) Images I0 and I2 are subtracted, yielding the temporal derivative It.
6) In parallel with step 5, image I1 is convolved with a 3 x 3 horizontal spatial gradient operator, returning the discrete form of Ix. In parallel, the vertical spatial gradient is calculated, yielding Iy (not shown).
7-8) The results from steps 5 and 6 are held in buffers and then are input to a look-up table that divides the temporal gradient at each pixel by the absolute value of the summed horizontal and vertical spatial gradients [which approximates the denominator in (3)]. This yields the normal velocity in the image at each pixel. These velocities are then thresholded, and any isolated (i.e., single pixel motion energy) blobs are morphologically eroded. The above-threshold velocities are then encoded as gray value 255. In our experiments, we threshold all velocities below 10 pixels per 60 ms to zero velocity.
9-10) In order to get the centroid of the motion information, we need the X and Y coordinates of the motion energy. For simplicity, we show only the situation for the X coordinate. The gray-value ramp in Fig. 3 is an image that encodes the horizontal coordinate value (0-255) for each point in the image as a gray value.
Thus, it is an image that is black (0) at horizontal pixel 0 and white (255) at horizontal pixel 255. If we logically AND each pixel of the above-threshold velocity image with the ramp image, we have an image which encodes high-velocity pixels with their positional coordinates in the image, and leaves pixels with no motion at zero.
11) By taking this result and histogramming it, via a special stage of the PIPE which performs histograms at frame rate speeds, we can find the centroid of the moving object by finding the mean of the resulting histogram. Histogramming the high-velocity, position-encoded image yields 256 16-bit values (a result for each intensity in the image). These 256 values can be read off the PIPE via a parallel interface in about 10 ms. This operation is performed in parallel to find the moving object's Y centroid (and in parallel for the X and Y centroids for camera 2). The total associated delay time for finding the centroid of a moving object becomes 15 cycles or 0.25 s.
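The per-frame computation in steps 2)-11) can be summarized in software. The following Python sketch uses NumPy/SciPy operators in place of the PIPE's hardware stages; the Gaussian and Sobel operators, the small-denominator guard, and the threshold value are illustrative assumptions rather than the exact operations performed on the PIPE.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def motion_energy_centroid(frames, vel_threshold=10.0):
    """Software analogue of steps 2)-11): estimate the centroid of motion energy
    from three consecutive grayscale frames (each a 2-D array)."""
    # Steps 2-4: smooth the three frames with a Gaussian mask.
    i0, i1, i2 = (gaussian_filter(f.astype(float), sigma=1.0) for f in frames)

    # Step 5: temporal derivative from the oldest and newest smoothed frames.
    it = i2 - i0

    # Step 6: horizontal and vertical spatial gradients of the middle frame.
    ix = sobel(i1, axis=1)
    iy = sobel(i1, axis=0)

    # Steps 7-8: normal velocity ~ |It| / (|Ix| + |Iy|), then threshold.
    denom = np.abs(ix) + np.abs(iy)
    normal_vel = np.abs(it) / (denom + 1e-3)   # guard against zero gradients
    moving = normal_vel > vel_threshold

    if not moving.any():
        return None  # no motion energy detected in this frame

    # Steps 9-11: centroid of the above-threshold pixels. The PIPE does this by
    # ANDing with a coordinate ramp and histogramming; here we simply average.
    ys, xs = np.nonzero(moving)
    return xs.mean(), ys.mean()
```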
The same algorithm is run in parallel on the PIPE for the second camera. Once the motion centroids are known for each camera, they are back-projected into the scene using the camera calibration matrices and triangulated to find the actual 3-D location of the movement. Because of the pipelined nature of the PIPE, a new X or Y coordinate is produced every 1/60 s with this delay. While we are able to derive 3-D position from motion stereo at real-time rates, there are a number of sources of noise and error inherent in the vision system. These include stereo triangulation error, moving shadows which are interpreted as object motion (we use no special lighting in the scene), and small shifts in centroid alignment due to the different viewing angles of the cameras, which have a large baseline. The net effect of this is to create a 3-D position signal that is accurate enough for gross-level object tracking, but is not sufficient for the smooth and highly accurate tracking required for grasping the object.
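The back-projection and triangulation of the two motion centroids can be sketched as follows. The excerpt does not spell out the triangulation used, so this sketch adopts one common formulation: back-project each centroid through an assumed 3x4 pinhole projection matrix and take the midpoint of the shortest segment between the two rays.

```python
import numpy as np

def backproject_ray(P, pixel):
    """Return (origin, direction) of the 3-D ray through an image point,
    given a 3x4 projection matrix P = K[R | t] (assumed calibration model)."""
    K_R = P[:, :3]
    t = P[:, 3]
    origin = -np.linalg.solve(K_R, t)                       # camera center
    d = np.linalg.solve(K_R, np.array([pixel[0], pixel[1], 1.0]))
    return origin, d / np.linalg.norm(d)

def triangulate_centroids(P1, c1, P2, c2):
    """Approximately intersect the rays through the motion-energy centroids c1, c2
    of the two cameras: midpoint of the shortest segment between the rays."""
    o1, d1 = backproject_ray(P1, c1)
    o2, d2 = backproject_ray(P2, c2)
    # Solve for the closest points o1 + s*d1 and o2 + u*d2 on the two rays.
    A = np.array([[d1 @ d1, -d1 @ d2],
                  [d1 @ d2, -d2 @ d2]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    s, u = np.linalg.solve(A, b)
    return 0.5 * ((o1 + s * d1) + (o2 + u * d2))
```

The residual distance between the two closest points also gives a rough consistency check on how well the two back-projected centroids actually correspond.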
Translation
Automated Tracking and Grasping of a Moving Object with a Robotic Hand-Eye System
Abstract — Most robotic grasping tasks assume a stationary object. In this paper we investigate what is required to track and grasp a moving object. The focus of our work is to achieve a high level of interaction between a real-time vision system capable of tracking moving objects in 3-D and a robot arm with a gripper that can pick up a moving object. We are interested in the interplay of hand-eye coordination in dynamic grasping tasks, such as grasping parts on a moving conveyor, assembling articulated parts, or grasping from a mobile robotic system. Coordination between an organism's sensing modalities and its motor control system is a hallmark of intelligent behavior, and our goal is to build an integrated sensing and actuation system that can operate in dynamic rather than static environments. The hand-eye system we have built addresses three distinct problems in grasping a moving object: fast computation of 3-D motion parameters from vision, predictive control of a moving robot arm to track the object, and interception and grasping, demonstrated experimentally by tracking a moving model train. The algorithms we have developed that relate sensing to actuation are quite general and applicable to a variety of complex robotic tasks that require visual feedback for arm and hand control.
1 Introduction
The focus of our work is to achieve a high level of interaction between a real-time vision system capable of tracking moving objects in 3-D and a robot arm equipped with a dexterous hand that can intercept, grasp, and pick up a moving object. We are interested in the interplay of hand-eye coordination in dynamic grasping tasks, such as grasping parts on a moving conveyor, assembling articulated parts, or grasping from a mobile robotic system. Coordination between an organism's sensing modalities and its motor control system is a hallmark of intelligent behavior, and our goal is to build an integrated sensing and actuation system that can operate in dynamic rather than static environments. Much robotics research in recent years has addressed either visual tracking of moving objects or generalized grasping problems, but few efforts have tried to combine the two. It is clear that complex robotic systems such as automated assembly systems require integrated systems that use visual feedback to plan, execute, and monitor grasping.
The hand-eye system we have built addresses three distinct problems in grasping a moving object: fast computation of 3-D motion parameters from vision, and predictive control of a moving robot arm to track, intercept, and grasp the object. The system can operate at approximately human arm movement rates, using visual feedback to track, intercept, stably grasp, and pick up a moving object. The algorithms we have developed that relate sensing to actuation are quite general and applicable to a variety of complex robotic tasks that require visual feedback for arm and hand control.
Our work also addresses a fundamental and limiting problem inherent in all integrated sensing-actuation systems: integrating subsystems with different sampling and processing rates. Most complex robotic systems are in fact built from different processing devices and methods. For example, our system consists of three independent computation systems: an image processing computer; a host computer that filters, triangulates, and predicts the object's 3-D position from the raw vision data; and a separate arm control system that performs inverse kinematics and joint-level servoing. Each of these systems has its own sampling rate, noise characteristics, and processing delays, and they must be integrated to achieve smooth and stable performance. In our case, we focus on overcoming visual processing noise and delays with a predictive filter based on a probabilistic analysis of the system's noise characteristics. In addition, the real-time arm controller must run at fast servo rates regardless of whether new predictions of the object's position are available.
The system has two fixed cameras that image the moving object, and a parallel-jaw gripper is used to pick up the object as it moves. The system operates as follows:
(1) The imaging system performs an optic-flow calculation at each pixel of the image. From these optic-flow fields, a motion energy profile is obtained and triangulated to determine the 3-D position of the moving object.
(2) The 3-D position obtained in step 1 is first lightly smoothed to remove sensor noise; a nonlinear filter is then applied to the trajectory parameters so that they can be used for forward prediction, and the corrected parameters are sent to the manipulator system.
(3) The trajectory planner updates the joint-level servos of the arm through kinematic transform equations. To guard against missed or delayed information from the vision and filtering system, an additional fixed-gain filter is included.
(4) Once tracking is stable, the system commands the arm to intercept the moving object, and the hand grasps it stably and picks it up.
The following sections describe each subsystem in detail, together with experimental results.
2 Previous Work
Because earlier work is too extensive to list exhaustively, we cite only some of the efforts that suggested similar approaches to us. Burt et al. focused on high-speed feature detection and hierarchical scaling of images to meet the real-time demands of surveillance and other robotic applications. Related work has been reported by Lee and Wohn and by Wiklund and Granlund, who use image differencing methods to track motion. Corke, Paul, and Wohn published a feature-based tracking method that uses special-purpose hardware to drive the servo controller of an arm-mounted camera. Goldenberg et al. developed a method that uses temporal filtering with vision hardware similar to ours. Luo, Mullen, and Wessel developed a method, based on Horn and Schunck's work, for tracking a moving object in one dimension. Verghese et al. developed a real-time, short-range method for visually tracking a moving object with a pipelined system similar to ours. Safadi developed a pyramid-based vision system with a tracking filter similar to ours. Rao and Durrant-Whyte developed a decentralized Kalman-filter-based tracking system that uses several cameras to track an object. Miller combined a camera and an arm for a tracking task in which the emphasis is on learning the kinematic and control parameters of the system. Weiss et al. likewise used visual feedback to develop control laws for manipulation. Brown built a gaze control system that links a robotic head with binocular cameras to a servo controller so that a fixed gaze can be maintained on a moving object. Clark and Ferrier also built a gaze control system for a mobile robot. A variation of the tracking problem is tracking with moving cameras.
Most of the work on control in motion tracking concerns generating smooth, up-to-date trajectories from the noisy, delayed outputs of various vision algorithms. In our earlier work we handled this problem in a similar way, using an α-β-γ filter, which is a steady-state Kalman filter. In the work of Papanikolopoulos et al., visual sensors are used in the feedback loop for adaptive robotic visual tracking. Sophisticated control schemes are described that combine the estimation and filtering power of a Kalman filter with an optimal (LQG) controller that computes the robot's motion. The vision system uses an optic-flow computation based on the SSD (sum of squared differences) method, which is time consuming but accurate enough for the tracking task. Making efficient use of windows in the image can further improve the performance of this method. The authors presented good tracking results and argued that the controller is robust enough that more complex (time-varying LQG) methods are not justified. Experiments with the CMU Direct Drive Arm II show that the method is accurate, robust, and promising.
Lee and Kay addressed the uncertainty caused by the location of the cameras in the robot's frame. In practice it is troublesome to keep every camera rigidly fixed in the robot's frame, because each time a camera is (usually accidentally) displaced, a tedious recalibration is required. Here, too, the position and orientation of the moving object are estimated in Cartesian space, and a simple error model is assumed. Andersen et al. adopted a third-order Kalman filter to let a two-degree-of-freedom robotic system play the labyrinth game. A somewhat different approach is taken in the work of Houshangi and of Koivo et al., in which autoregressive (AR) and autoregressive moving-average with exogenous input (ARMAX) models are investigated for visual tracking.
3 Vision System
In a visual tracking problem, motion in the imaging system has to be translated into 3-D scene motion. Our approach is first to compute local optic-flow fields that measure the image velocity at each pixel. Many techniques exist for computing optic-flow fields, including matching-based techniques, gradient-based techniques, and spatio-temporal energy methods. Optic flow was chosen as the primitive on which to base the tracking algorithm for the following reasons:
(1) Tracking an object in three dimensions implies that there will be motion across the retinas (image planes) that image the scene. By identifying this motion in each camera, we can begin to determine the object's actual 3-D motion.
(2) The main constraint in the imaging process is the high computational speed required to keep the robot arm parameters updated. We therefore need to estimate image motion quickly and robustly. The Horn-Schunck optic-flow algorithm is well suited to real-time computation on our PIPE image processing engine.
(3) We have developed a new framework for computing optic flow robustly within an estimation-theoretic setting. Although the present work does not explicitly use this framework, we plan to adapt the algorithm to it in the future.
Our method begins with an implementation of the Horn-Schunck optic-flow algorithm. Its underlying assumption is the optic-flow constraint equation, which states that the image irradiance at times t and t + δt is the same.
If we expand this equation in a Taylor series and drop the second- and higher-order terms, we obtain a linear constraint in which u and v are the velocities in image space and Ix, Iy, and It are the spatial and temporal derivatives of the image. This constraint restricts the velocity field of the image to lie on a straight line in velocity space. Because of the aperture problem we cannot determine the actual velocity directly from this constraint, but we can recover the component of velocity normal to the constraint line. An iterative process, based on smoothness and heuristic constraints, is usually employed to propagate velocities across image neighborhoods; these additional constraints are used to recover the actual image velocities u and v. Although computationally appealing, this way of determining optic flow has inherent drawbacks. First, the estimate is computed pixel by pixel, which demands a large amount of computation. Second, optic-flow information is available only where the gradients defined above exist.
We have overcome the first problem by using the PIPE image processor, a pipelined parallel computer that can process 256 x 256 x 8-bit images at frame rate and supports the operations needed for optic-flow computation in a pixel-parallel fashion. The difficulty of the second problem is reduced because we do not need to know the actual velocities in the image: a set of normal velocities at strong gradients is sufficient for our task, which removes the need to iteratively propagate velocities across the image.
A. Computing Normal Optic-Flow in Real Time
Our goal is to track a single moving object in real time. We use two fixed cameras to image the scene and report the object's motion in 3-D to the arm control system. Each camera is calibrated with respect to the 3-D scene, but the cameras need not be registered (scan-line coherent). Our method computes the normal component of optic flow for each pixel of each camera image, finds the centroid of motion energy in each image, and then triangulates the back-projected centroids. Four processors are used in parallel on the PIPE, two per camera, to compute the X and Y coordinates of the motion-energy centroid in each image. A special processor board is also used to compute histograms in real time. The steps below correspond to the numbers in Fig. 3:
(1) The cameras image the scene and the images are sent to the processing stages of the PIPE.
(2) The image is smoothed by convolution with a Gaussian mask. The convolution operator is built into the PIPE and can be performed in one frame cycle.
(3-4) In the next two cycles, two more images are read in, smoothed, and buffered. Buffering and pipelining the images introduces output delays but makes temporal operations on the images possible; there are now three smoothed images in the PIPE.
(5) Images I0 and I2 are subtracted, yielding the temporal derivative It.
(6) In parallel with step 5, image I1 is convolved with a 3 x 3 horizontal spatial gradient operator, returning the discrete form of Ix; the vertical gradient Iy is computed at the same time.
(7-8) The results of steps 5 and 6 are held in buffers and then fed into a look-up table that divides the temporal derivative at each pixel by the absolute value of the summed horizontal and vertical spatial gradients. This gives the normal velocity at each pixel of the image. These velocities are then thresholded, and any isolated single-pixel blobs of motion energy are removed by morphological erosion. The above-threshold velocities are encoded as gray value 255. In our experiments, all velocities below 10 pixels per 60 ms are set to zero.
(9-10) To determine the centroid of the moving object, we need to combine the X and Y information. For simplicity, only the X coordinate is described. The gray-value ramp in Fig. 3 encodes the horizontal coordinate (0-255) of each point as a gray value: the image is black where the horizontal coordinate is 0 and white where it is 255. ANDing each pixel of the thresholded velocity image with this ramp encodes the moving pixels with their positional coordinates.
(11) By histogramming this result, using a special stage of the PIPE that computes histograms at frame rate, and analyzing the histogram, we obtain the centroid of the moving object. The histogram of the high-velocity, position-encoded image consists of 256 values, which can be read out through the PIPE's parallel interface in about 10 ms; the Y coordinate is obtained at the same time. The total delay in locating a moving object is 15 cycles, or 0.25 s.
The same algorithm runs in parallel on the PIPE for the second camera. Once the motion centroid is found in each camera, the centroids are back-projected through the camera calibration and triangulated to determine the object's 3-D position. Because of the pipelined nature of the PIPE, a new X or Y coordinate is produced every 1/60 s. Although we can determine the object's 3-D position at real-time rates, the vision system still contains considerable noise and systematic error, as well as small offsets caused by the cameras' different viewpoints. The net effect is a position estimate that locates the object only approximately: it is not accurate enough for the precise tracking needed to grasp the object. In the next section we describe a probabilistic model that filters out more of the noise and yields a more stable and accurate estimate of the object's 3-D position.