sql*load批量导入Oracle数据库

SQL*LOADER是大型数据

　　仓库选择使用的加载方法，因为它提供了最快速的途径（DIRECT，PARALLEL）。现在，我们抛开其理论不谈，用实例来使

　　您快速掌握SQL*LOADER的使用方法。

　　首先，我们认识一下SQL*LOADER。

　　在NT下，SQL*LOADER的命令为SQLLDR，在UNIX下一般为sqlldr/sqlload。

　　如执行：d:\oracle>sqlldr

　　SQL*Loader: Release 8.1.6.0.0 - Production on 星期二 1月 8 11:06:42 2002

　　用法: SQLLOAD 关键字 = 值 [,keyword=value,...]

　　有效的关键字:

　　userid -- ORACLE username/password

　　control -- Control file name

　　log -- Log file name

　　bad -- Bad file name

　　data -- Data file name

　　discard -- Discard file name

　　discardmax -- Number of discards to allow (全部默认)

　　skip -- Number of logical records to skip (默认0)

　　load -- Number of logical records to load (全部默认)

　　errors -- Number of errors to allow (默认50)

　　rows -- Number of rows in conventional path bind array or between direct p

　　ath data saves

　　（默认: 常规路径 64, 所有直接路径）

　　bindsize -- Size of conventional path bind array in bytes(默认65536)

　　silent -- Supdivss messages during run (header,feedback,errors,discards,part

　　itions)

　　direct -- use direct path (默认FALSE)

　　parfile -- parameter file: name of file that contains parameter specification

　　parallel -- do parallel load (默认FALSE)

　　file -- File to allocate extents from

　　skip_unusable_indexes -- disallow/allow unusable indexes or index partitions(默认FALSE)

　　skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable(默认FALSE)

　　commit_discontinued -- commit loaded rows when load is discontinued(默认FALSE)

　　readsize -- Size of Read buffer (默认1048576)

　　PLEASE NOTE: 命令行参数可以由位置或关键字指定。

　　前者的例子是 'sqlload scott/tiger foo';

　　后者的例子是 'sqlload control=foo userid=scott/tiger'.

　　位置指定参数的时间必须早于但不可迟于由关键字指定的参数。例如,

　　'SQLLOAD SCott/tiger control=foo logfile=log',

　　但'不允许 sqlload scott/tiger control=foo log',

　　即使允许参数 'log' 的位置正确。

　　d:\oracle>

　　我们可以从中看到一些基本的帮助信息，这里，我用到的是中文的WIN2000　ADV　SERVER。

　　我们知道，SQL*LOADER只能导入纯文本，所以我们现在开始以实例来讲解其用法。

　　一、已存在数据源result.csv，欲倒入ORACLE中FANCY用户下。

　　result.csv内容：

　　1,默认 Web 站点,192.168.2.254:80:,RUNNING

　　2,other,192.168.2.254:80:test.com,STOPPED

　　3,third,192.168.2.254:81:thirdabc.com,RUNNING

　　从中，我们看出4列，分别以逗号分隔，为变长字符串。

　　二、制定控制文件result.ctl

　　result.ctl内容：

　　load data

　　infile 'result.csv'

　　into table resultxt

　　(resultid char terminated by ',',

　　website char terminated by ',',

　　ipport char terminated by ',',

　　status char terminated by whitespace)

　　说明：

　　infile　指数据源文件　这里我们省略了默认的　discardfile result.dsc badfile result.bad

　　into table resultxt 默认是INSERT，也可以into table resultxt APPEND为追加方式，或REPLACE

　　terminated by ','　指用逗号分隔

　　terminated by whitespace　结尾以空白分隔

　　三、此时我们执行加载：

　　D:\>sqlldr userid=fancy/testpass control=result.ctl log=resulthis.out

　　SQL*Loader: Release 8.1.6.0.0 - Production on 星期二 1月 8 10:25:42 2002

　　SQL*Loader-941: 在描述表RESULTXT时出现错误

　　ORA-04043: 对象 RESULTXT 不存在

　　提示出错，因为数据库没有对应的表。

　　四、在数据库建立表

　　create table resultxt

　　(resultid varchar2(500),

　　website varchar2(500),

　　ipport varchar2(500),

　　status varchar2(500))

　　五、重新执行加载

　　D:\>sqlldr userid=fancy/k1i7l6l8 control=result.ctl log=resulthis.out

　　SQL*Loader: Release 8.1.6.0.0 - Production on 星期二 1月 8 10:31:57 2002

　　达到提交点，逻辑记录计数2

　　达到提交点，逻辑记录计数3

　　已经成功！我们可以通过日志文件来分析其过程：resulthis.out内容如下：

　　SQL*Loader: Release 8.1.6.0.0 - Production on 星期二 1月 8 10:31:57 2002

　　控制文件: result.ctl

　　数据文件: result.csv

　　错误文件: result.bad

　　废弃文件: 未作指定

　　(可废弃所有记录)

　　装载数: ALL

　　跳过数: 0

　　允许的错误: 50

　　绑定数组: 64 行，最大 65536 字节

　　继续: 未作指定

　　所用路径: 常规

　　表RESULTXT

　　已载入从每个逻辑记录

　　插入选项对此表INSERT生效

　　列名位置长度中止包装数据类型

　　------------------------------ ---------- ----- ---- ---- ---------------------

　　RESULTID FIRST * , CHARACTER

　　WEBSITE NEXT * , CHARACTER

　　IPPORT NEXT * , CHARACTER

　　STATUS NEXT * WHT CHARACTER

　　表RESULTXT:

　　3 行载入成功

　　由于数据错误, 0 行没有载入。

　　由于所有 WHEN 子句失败, 0 行没有载入。

　　由于所有字段都为空的, 0 行没有载入。

　　为结合数组分配的空间: 65016字节（63行）

　　除绑定数组外的内存空间分配: 0字节

　　跳过的逻辑记录总数: 0

　　读取的逻辑记录总数: 3

　　拒绝的逻辑记录总数: 0

　　废弃的逻辑记录总数: 0

　　从星期二 1月 08 10:31:57 2002开始运行

　　在星期二 1月 08 10:32:00 2002处运行结束

　　经过时间为: 00: 00: 02.70

　　CPU 时间为: 00: 00: 00.10(可

　　六、并发操作

　　sqlldr userid=/ control=result1.ctl direct=true parallel=true

　　sqlldr userid=/ control=result2.ctl direct=true parallel=true

　　当加载大量数据时（大约超过10GB），最好抑制日志的产生：

　　SQL>ALTER TABLE RESULTXT nologging;

　　这样不产生REDO　LOG，可以提高效率。然后在CONTROL文件中load data上面加一行：unrecoverable

　　此选项必须要与DIRECT共同应用。

　　在并发操作时，ORACLE声称可以达到每小时处理100GB数据的能力！其实，估计能到1－10G就算不错了，开始可用结构

　　相同的文件，但只有少量数据，成功后开始加载大量数据，这样可以避免时间的浪费。

　　我的示例：

　　一、在数据库建立表格weather如下：

　　create table weather(

　　outlook varchar(20),

　　temperature float,

　　humidity float,

　　windy varchar(10),

　　play varchar(10)

　　)

　　二、在F盘建立两个文件分别如下：

　　#1、result.ctl内容如下：

　　load data

　　infile 'result.csv'

　　into table weather

　　(outlook char terminated by ',',

　　temperature char terminated by ',',

　　humidity char terminated by ',',

　　windy char terminated by ',',

　　play char terminated by ','

　　)

　　#2、result.csv内容如下：

　　sunny,85,85,FALSE,no

　　sunny,80,90,TRUE,no

　　overcast,83,86,FALSE,yes

　　rainy,70,96,FALSE,yes

　　rainy,68,80,FALSE,yes

　　rainy,65,70,TRUE,no

　　overcast,64,65,TRUE,yes

　　sunny,72,95,FALSE,no

　　sunny,69,70,FALSE,yes

　　rainy,75,80,FALSE,yes

　　sunny,75,70,TRUE,yes

　　overcast,72,90,TRUE,yes

　　overcast,81,75,FALSE,yes

　　rainy,71,91,TRUE,no

　　三、命令行下执行

　　F:\>sqlldr userid=cqsb/ctbujx123 control=result.ctl

　　四、搞定

«
»

PHP教程

PHP简介 PHP基本语法 PHP类型 PHP变量 PHP运算符 PHP控制结构 PHP函数 PHP类与对象 PHP异常处理函数库分类